This is an archive of the discontinued LLVM Phabricator instance.


Differential D57504

RFC: Prototype & Roadmap for vector predication in LLVM
Changes Planned | Public

Authored by simoll on Jan 31 2019, 3:12 AM.


Details

Reviewers

mkuper
fhahn
rengolin
huntergr
sdesmalen
m_zuckerman
jdoerfert

Summary

Vector Predication Roadmap

This proposal defines a roadmap towards native vector predication in LLVM, specifically for vector instructions with a mask and/or an explicit vector length.
LLVM currently has no target-independent means to model predicated vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V extension and NEC SX-Aurora.
Only some predicated vector operations, such as masked loads and stores, are available through intrinsics [MaskedIR]_.

Please use docs/Proposals/VectorPredication.rst to comment on the summary.

Vector Predication intrinsics

The prototype in this patch demonstrates the following concepts:

Predicated vector intrinsics with an explicit mask and vector length parameter on IR level.
First-class predicated SDNodes on ISel level. Mask and vector length are value operands.
An incremental strategy to generalize PatternMatch/InstCombine/InstSimplify and DAGCombiner to work on both regular instructions and VP intrinsics.
DAGCombiner example: FMA fusion.
InstCombine/InstSimplify example: FSub pattern re-writes.
Early experiments on the LNT test suite (Clang static release, O3 -ffast-math) indicate that compile time on non-VP IR is not affected by the API abstractions in PatternMatch, etc.
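
To illustrate the first concept above, here is a minimal C++ model of how a predicated add with a mask and an explicit vector length could behave on a fixed-width vector. This is a sketch, not code from the patch: `vpAdd`, `Lanes`, and the aliases are hypothetical names, and modelling a disabled lane as "no value" is an assumption made for this example only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical model: 4 lanes of 32-bit integers.
constexpr std::size_t Lanes = 4;
using Vec = std::array<std::int32_t, Lanes>;
using Mask = std::array<bool, Lanes>;
// An empty optional models "no defined result" for a disabled lane
// (an assumption of this sketch).
using Result = std::array<std::optional<std::int32_t>, Lanes>;

// Lane i is computed only if its mask bit is set and i < evl.
Result vpAdd(const Vec &a, const Vec &b, const Mask &mask, std::size_t evl) {
  Result out{};
  for (std::size_t i = 0; i < Lanes; ++i) {
    if (mask[i] && i < evl)
      out[i] = a[i] + b[i];
  }
  return out;
}
```

With a mask of `{true, false, true, true}` and a vector length of 3, only lanes 0 and 2 produce a value: lane 1 is masked off and lane 3 lies beyond the vector length.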

Roadmap

Drawing from the prototype, we propose the following roadmap towards native vector predication in LLVM:

1. IR-level VP intrinsics

There is a consensus on the semantics/instruction set of VP intrinsics.
VP intrinsics and attributes are available on IR level.
TTI has capability flags for VP (`supportsVP()`?, `haveActiveVectorLength()`?).

Result: VP usable for IR-level vectorizers (LV, VPlan, RegionVectorizer), potential integration in Clang with builtins.

2. CodeGen support

VP intrinsics translate to first-class SDNodes (`llvm.vp.fdiv.* -> vp_fdiv`).
VP legalization (legalize explicit vector length to mask (AVX512), legalize VP SDNodes to pre-existing ones (SSE, NEON)).

Result: Backend development based on VP SDNodes.
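
The first legalization step named in Stage 2, folding an explicit vector length into the mask for targets that only have mask registers, can be sketched as follows. This is an illustrative model and not the patch's implementation; `foldEvlIntoMask`, `NumLanes`, and `LaneMask` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t NumLanes = 8;  // hypothetical fixed vector width
using LaneMask = std::array<bool, NumLanes>;

// Returns a mask that is true only where the original mask is true and
// the lane index is below the explicit vector length. After this step
// the vector length operand can be treated as "all lanes".
LaneMask foldEvlIntoMask(const LaneMask &mask, std::size_t evl) {
  LaneMask folded{};
  for (std::size_t i = 0; i < NumLanes; ++i)
    folded[i] = mask[i] && i < evl;
  return folded;
}
```

Folding with a vector length equal to the full width returns the mask unchanged, so the step does nothing for operations that do not restrict the vector length.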

3. Lift InstSimplify/InstCombine/DAGCombiner to VP

Introduce PredicatedInstruction, PredicatedBinaryOperator, .. helper classes that match standard vector IR and VP intrinsics.
Add a matcher context to PatternMatch and context-aware IR Builder APIs.
Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular vector instructions.
Incrementally lift InstCombine/InstSimplify to operate on VP as well as regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*

Modernize llvm.masked.* / llvm.experimental.reduce.* by translating to VP.
DCE transitional APIs.

Result: VP has superseded earlier vector intrinsics.

5. Predicated IR Instructions

Vector instructions have an optional mask and vector length parameter. These lower to VP SDNodes (from Stage 2).
Phase out VP intrinsics, only keeping those that are not equivalent to vectorized scalar instructions (reduce, shuffles, ..).
InstCombine/InstSimplify expect predication in regular Instructions (Stage 3 has laid the groundwork).

Result: Native vector predication in IR.

References

.. [MaskedIR] llvm.masked.* intrinsics, https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics
.. [EvlRFC] Explicit Vector Length RFC, https://reviews.llvm.org/D53613

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	3,100 ms	LLVM.CodeGen/Mips/msa::Unknown Unit Message ("")
	6,110 ms	LLVM.CodeGen/Mips/msa::Unknown Unit Message ("")
	7,500 ms	LLVM.CodeGen/Mips/msa::Unknown Unit Message ("")
	5,700 ms	LLVM.CodeGen/Mips/msa::Unknown Unit Message ("")
	50 ms	LLVM.Transforms/InstCombine::Unknown Unit Message ("")
		View Full Test Results (6 Failed)

Event Timeline

There are a very large number of changes, so older changes are hidden.

rengolin added subscribers: Ayal, hsaito. Feb 14 2019, 2:57 AM

cameron.mcinally added a subscriber: cameron.mcinally. Feb 25 2019, 10:08 AM

samparker added a subscriber: samparker. Feb 27 2019, 1:24 AM

SjoerdMeijer added a subscriber: SjoerdMeijer. Mar 7 2019, 11:41 AM

chill added a subscriber: chill. Mar 16 2019, 2:01 AM

re-based onto master

Herald added subscribers: nhaehnle, jvesely. Mar 19 2019, 12:10 AM

Harbormaster completed remote builds in B29336: Diff 191252. Mar 19 2019, 12:12 AM

mcberg2017 added a subscriber: mcberg2017. Mar 19 2019, 4:30 PM

alexsusu added a subscriber: alexsusu. Mar 23 2019, 12:39 PM

dmgreen added a subscriber: dmgreen. Mar 28 2019, 10:00 AM

vchuravy added a subscriber: vchuravy. Apr 5 2019, 5:44 AM

Updates

added constrained fp intrinsics (IR level only).
initial support for mapping llvm.experimental.constrained.* intrinsics to llvm.vp.constrained.*.

Cross references

llvm.experimental.reduce.* (https://reviews.llvm.org/D60261 and/or https://reviews.llvm.org/D60262) - VP reduction signatures should track what comes out of that RFC.
SVE type support (https://reviews.llvm.org/D32530) - VPBuilder has to be made compatible with SVE types (it uses a static vector length atm).

Harbormaster completed remote builds in B30614: Diff 195366. Apr 16 2019, 6:39 AM

In D57504#1468510, @simoll wrote:

Updates

added constrained fp intrinsics (IR level only).

initial support for mapping llvm.experimental.constrained.* intrinsics to llvm.vp.constrained.*.

Do we really need both vp.fadd() and vp.constrained.fadd()? Can't we just use the latter with rmInvalid/ebInvalid? That should prevent vp.constrained.fadd from losing optimizations w/o good reasons.
Do we have enough upside in having both?

In D57504#1469354, @hsaito wrote:

In D57504#1468510, @simoll wrote:

Updates

added constrained fp intrinsics (IR level only).

initial support for mapping llvm.experimental.constrained.* intrinsics to llvm.vp.constrained.*.

Do we really need both vp.fadd() and vp.constrained.fadd()? Can't we just use the latter with rmInvalid/ebInvalid? That should prevent vp.constrained.fadd from losing optimizations w/o good reasons.

According to the LLVM langref, "fpexcept.ignore" seems to be the right option for exceptions whereas there is no "round.permissive" option for the rounding behavior. Abusing rmInvalid/ebInvalid seems hacky.

Do we have enough upside in having both?

I see no harm in having both since we already add the infrastructure in LLVM-VP to abstract away from specific instructions and/or intrinsics. Once (if ever) exception, rounding mode become available for native instructions (or can be an optional tag-on like fast-math flags), we can deprecate all constrained intrinsics and use llvm.vp.fdiv, etc or native instructions instead.

In D57504#1469847, @simoll wrote:

In D57504#1469354, @hsaito wrote:

In D57504#1468510, @simoll wrote:

Updates

added constrained fp intrinsics (IR level only).

initial support for mapping llvm.experimental.constrained.* intrinsics to llvm.vp.constrained.*.

Do we really need both vp.fadd() and vp.constrained.fadd()? Can't we just use the latter with rmInvalid/ebInvalid? That should prevent vp.constrained.fadd from losing optimizations w/o good reasons.

According to the LLVM langref, "fpexcept.ignore" seems to be the right option for exceptions whereas there is no "round.permissive" option for the rounding behavior. Abusing rmInvalid/ebInvalid seems hacky.

Do we have enough upside in having both?

I see no harm in having both since we already add the infrastructure in LLVM-VP to abstract away from specific instructions and/or intrinsics. Once (if ever) exception, rounding mode become available for native instructions (or can be an optional tag-on like fast-math flags), we can deprecate all constrained intrinsics and use llvm.vp.fdiv, etc or native instructions instead.

There is an indirect harm in adding more intrinsics with partially-redundant semantics: writing transformations and analyses requires logic that handles both forms. I recommend having fewer intrinsics where we can have fewer intrinsics.

In D57504#1469847, @simoll wrote:

Do we really need both vp.fadd() and vp.constrained.fadd()? Can't we just use the latter with rmInvalid/ebInvalid? That should prevent vp.constrained.fadd from losing optimizations w/o good reasons.

According to the LLVM langref, "fpexcept.ignore" seems to be the right option for exceptions whereas there is no "round.permissive" option for the rounding behavior. Abusing rmInvalid/ebInvalid seems hacky.

Then, please propose one more rounding mode, like round.permissive or round.any.

In D57504#1470254, @hfinkel wrote:

In D57504#1469847, @simoll wrote:

In D57504#1469354, @hsaito wrote:

In D57504#1468510, @simoll wrote:

Updates

added constrained fp intrinsics (IR level only).

initial support for mapping llvm.experimental.constrained.* intrinsics to llvm.vp.constrained.*.

Do we really need both vp.fadd() and vp.constrained.fadd()? Can't we just use the latter with rmInvalid/ebInvalid? That should prevent vp.constrained.fadd from losing optimizations w/o good reasons.

According to the LLVM langref, "fpexcept.ignore" seems to be the right option for exceptions whereas there is no "round.permissive" option for the rounding behavior. Abusing rmInvalid/ebInvalid seems hacky.

Do we have enough upside in having both?

I see no harm in having both since we already add the infrastructure in LLVM-VP to abstract away from specific instructions and/or intrinsics. Once (if ever) exception, rounding mode become available for native instructions (or can be an optional tag-on like fast-math flags), we can deprecate all constrained intrinsics and use llvm.vp.fdiv, etc or native instructions instead.

There is an indirect harm in adding more intrinsics with partially-redundant semantics: writing transformations and analyses requires logic that handles both forms. I recommend having fewer intrinsics where we can have fewer intrinsics.

Yep. If one additional generally-useful rounding mode gets rid of several partially redundant intrinsics, that would be a good trade-off.

In D57504#1469847, @simoll wrote:

According to the LLVM langref, "fpexcept.ignore" seems to be the right option for exceptions whereas there is no "round.permissive" option for the rounding behavior. Abusing rmInvalid/ebInvalid seems hacky.

If you use "round.tonearest" that will get you the same semantics as the non-constrained version. The optimizer assumes round-to-nearest by default.

Would it make sense to also update docs/AddingConstrainedIntrinsics.rst please?

Thanks for your feedback!

Planned

Make the llvm.vp.constrained.* versions the only fp ops in vp. Encode default fp semantics by passing fpexcept.ignore and round.tonearest.
Update docs/AddingConstrainedIntrinsics.rst to account for the fact that llvm.experimental.constrained.* is no longer the only namespace for constrained intrinsics.
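
As a small illustration of the first planned item, the check "does this operation use default fp semantics?" could look like the sketch below. The metadata strings are the ones named in this thread; the helper `isDefaultFPEnv` is hypothetical and not part of the patch.

```cpp
#include <string_view>

// Hypothetical helper: true if the two metadata strings describe the
// default floating-point environment, i.e. the operation may be treated
// like a regular (non-constrained) fp instruction.
constexpr bool isDefaultFPEnv(std::string_view rounding,
                              std::string_view exceptions) {
  return rounding == "round.tonearest" && exceptions == "fpexcept.ignore";
}
```

Any other combination, for example a different rounding mode or exception behavior, would have to be treated as constrained.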

In D57504#1470705, @kpn wrote:

Would it make sense to also update docs/AddingConstrainedIntrinsics.rst please?

Sure. I don't think we should match (in the API) an llvm.vp.constrained.* intrinsic as ConstrainedFPIntrinsic though.
Conceptually, an llvm.vp.constrained.* intrinsic is certainly both a VPIntrinsic and a ConstrainedFPIntrinsic. If the latter is used to transform them, ignoring the mask and vector length arguments along the way, we'll see breakage (in the future, once there are transforms for constrained fp).

pengfei added a subscriber: pengfei. Apr 22 2019, 6:17 PM

vkmr added a subscriber: vkmr. May 7 2019, 7:55 AM

rengolin mentioned this in D53613: RFC: Explicit Vector Length Intrinsics and Attributes. Jul 2 2019, 1:59 AM

This is a "Keepalive" message - I will get back working on LLVM-VP in October.

Herald added a reviewer: rengolin. Aug 7 2019, 2:09 AM

Herald added subscribers: s.egerton, simoncook.

Nice. Btw. another motivation could be std::simd. Here the overflow intrinsics exposed as builtin would allow us to provide a fast implementation of the masked variants of <simd>

sepavloff added a subscriber: sepavloff. Aug 21 2019, 9:31 AM

Picking this up again. I begin with changing the VP intrinsics as outlined before with one deviation from the earlier plan:

There will be no llvm.vp.constrained.* just llvm.vp.* and all FP intrinsics will have an exception mode and rounding mode parameter.

Herald added subscribers: lenary, hiraditya. Oct 7 2019, 4:45 AM

This work was mentioned on the SVE discussion about predication, adding arm folks, just in case.

<<~same mail sent to llvm-dev>>

Who is interested in a round table on vector predication at the '19 US DevMtg and/or would like to help organize one? There were some proposals for related round tables on the mailing list but not all of them have a time slot yet (VPlan, SVE, complex math, constrained fp, ..). I am eyeing the Wednesday, 11:55 slot so please let me know if there is a schedule conflict I am not aware of.

Potential Topics:

Intersection with constrained-fp intrinsics and backend support (also complex arith).

Design of predicated reduction intrinsics (intersection with llvm.experimental.reduce[.v2].*).

Compatibility with SVE LLVM extension.

<Your topic here>

a.elovikov added a subscriber: a.elovikov. Oct 17 2019, 11:36 AM

rscottmanley added a subscriber: rscottmanley. Oct 18 2019, 8:07 AM

Are predicated vector instructions not just a special case of DemandedBits? Why can't we leave out the .vp. intrinsics, and just generate the predicate with DemandedBits? That way you do a predicated vector operation like so (in zig):

var notzero = v != 0;
if (std.vector.any(notzero)) {
    v = std.vector.select(5 / v, v, notzero);
}

As the example makes clear, this optimization would have to be guaranteed in order for the generated code to be correct (as the predicate avoids a divide-by-zero error).

In D57504#1720792, @shawnl wrote:
Are predicated vector instructions not just a special case of DemandedBits? Why can't we leave out the .vp. intrinsics, and just generate the predicate with DemandedBits? That way you do a predicated vector operation like so (in zig): As the example makes clear, this optimization would have to be guaranteed in order for the generated code to be correct (as the predicate avoids a divide-by-zero error).

var notzero = v != 0;
if (std.vector.any(notzero)) {
v = std.vector.select(5 / v, v, notzero);
}

What you describe is a workaround but not a solution for predicated SIMD in LLVM.
This approach may seem natural considering SIMD ISAs, such as x86 SSE, ARM NEON, that do not have predication.
It is however a bad fit for SIMD instruction sets that do support predicated SIMD (AVX512, ARM SVE, RISC-V V, NEC SX-Aurora).

As it turns out, it is more robust to have predicated instructions right in LLVM IR and convert them to the instruction+select pattern for SSE and friends than going the other way round.
This is what LLVM-VP proposes.
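
To make the difference concrete, the following sketch contrasts a natively predicated integer division with one possible expansion for targets without predication. It is a model under stated assumptions, not what LLVM-VP emits: `predicatedDiv`, `expandedDiv`, and the choice of substituting a divisor of 1 on disabled lanes are hypothetical. The unpredicated form has to neutralize disabled lanes before the division executes, because a select applied only to the result would come too late to prevent a division by zero.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t W = 4;  // hypothetical vector width
using IVec = std::array<int, W>;
using BVec = std::array<bool, W>;

// Native predication: disabled lanes never execute the division and
// keep the passthru value.
IVec predicatedDiv(const IVec &num, const IVec &den, const BVec &mask,
                   const IVec &passthru) {
  IVec out = passthru;
  for (std::size_t i = 0; i < W; ++i)
    if (mask[i])
      out[i] = num[i] / den[i];
  return out;
}

// Expansion for a target without predication: every lane executes the
// division, so disabled lanes first get a harmless divisor, and a final
// select restores the passthru value.
IVec expandedDiv(const IVec &num, const IVec &den, const BVec &mask,
                 const IVec &passthru) {
  IVec safeDen{};
  for (std::size_t i = 0; i < W; ++i)
    safeDen[i] = mask[i] ? den[i] : 1;  // select on the operand
  IVec all{};
  for (std::size_t i = 0; i < W; ++i)
    all[i] = num[i] / safeDen[i];  // unconditional division
  IVec out{};
  for (std::size_t i = 0; i < W; ++i)
    out[i] = mask[i] ? all[i] : passthru[i];  // select on the result
  return out;
}
```

Both functions agree on the result, but the expanded form only stays correct as long as no later transformation removes or reorders the two selects around the division.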

DevMtg Summary

There will be a separate RFC for the generalized pattern rewriting logic in LLVM-VP (see PatternMatch.h). We do this because it is useful for other efforts as well, eg to make the existing pattern rewrites in InstSimplify/Combine, DAGCombiner work also for constrained fp (@uweigand ) and complex arithmetic (@greened) . This may actually speedup things since we can pursue VP and generalized pattern match in parallel.
@nhaehnle brought up that the LLVM-VP intrinsics should be convenient and natural to work with. The convenience wrappers (PredicatedInstruction, PredicatedBinaryOperator) and pattern rewrite generalizations already achieve this to a large extent. Specifically, there should be no "holes" when it comes to handling the intrinsics (eg it should not be necessary to resort to lower-level APIs (VPIntrinsic) when dealing with predicated SIMD). (To take something actionable from this, I think there should be an IRBuilder<>::CreateVectorFAdd(A, B, Mask, AVL, InsertPt), returning a PredicatedBinaryOperator, which may either be an FAdd instruction (or constrained fp..) or a llvm.vp.fadd intrinsic, depending on the fp environment, mask parameter, and vector length parameter.)
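
One way to picture the decision such a builder entry point would make is the toy below. It is a guess at a possible rule, not the proposed API: `EmittedOp` and `chooseVectorFAdd` are hypothetical, the mask is reduced to "all lanes enabled or not", and the fp environment is left out entirely.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical stand-ins for what such a builder could emit.
enum class EmittedOp { PlainFAdd, VPFAdd };

// Hypothetical decision rule: emit a regular fadd when there is no
// predication to express (no mask or an all-true mask, and no vector
// length or one that covers all lanes); otherwise emit llvm.vp.fadd.
EmittedOp chooseVectorFAdd(std::optional<bool> maskIsAllTrue,
                           std::optional<std::size_t> avl,
                           std::size_t numLanes) {
  bool maskTrivial = !maskIsAllTrue.has_value() || *maskIsAllTrue;
  bool avlTrivial = !avl.has_value() || *avl >= numLanes;
  return (maskTrivial && avlTrivial) ? EmittedOp::PlainFAdd
                                     : EmittedOp::VPFAdd;
}
```

Callers would not need to know which form they received, which is the "no holes" property asked for above.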

In D57504#1723466, @simoll wrote:
In D57504#1720792, @shawnl wrote:
Are predicated vector instructions not just a special case of DemandedBits? Why can't we leave out the .vp. intrinsics, and just generate the predicate with DemandedBits? That way you do a predicated vector operation like so (in zig): As the example makes clear, this optimization would have to be guaranteed in order for the generated code to be correct (as the predicate avoids a divide-by-zero error).

var notzero = v != 0;
if (std.vector.any(notzero)) {
v = std.vector.select(5 / v, v, notzero);
}
What you describe is a workaround but not a solution for predicated SIMD in LLVM.
This approach may seem natural considering SIMD ISAs, such as x86 SSE, ARM NEON, that do not have predication.
It is however a bad fit for SIMD instruction sets that do support predicated SIMD (AVX512, ARM SVE, RISC-V V, NEC SX-Aurora).

As it turns out, it is more robust to have predicated instructions right in LLVM IR and convert them to the instruction+select pattern for SSE and friends than going the other way round.
This is what LLVM-VP proposes.

+1 on what Simon said.

There are lots of peepholes like:

select ?, X, undef -> X

If we optimize away the select, we could end up incorrectly trapping on the no longer masked bits of X. This would be bad for the constrained intrinsics.

But also in the general case, it's very hard to keep a select glued to an operation through opt and llc.

In D57504#1724196, @cameron.mcinally wrote:

+1 on what Simon said.

+1.

In D57504#1723586, @simoll wrote:

DevMtg Summary

There will be a separate RFC for the generalized pattern rewriting logic in LLVM-VP (see PatternMatch.h). We do this because it is useful for other efforts as well, eg to make the existing pattern rewrites in InstSimplify/Combine, DAGCombiner work also for constrained fp (@uweigand ) and complex arithmetic (@greened) . This may actually speedup things since we can pursue VP and generalized pattern match in parallel.

I'd like to rant a little bit to see if anyone agrees with my probably unpopular opinion...

Code explosion is the symptom, not the sickness. It's caused by using experimental intrinsics. Experimental intrinsics are a detriment to progress. They end up creating a ton more work and are designed to be inevitably replaced.

I do understand the desire for these intrinsics by some -- i.e. devs that don't care about these new features aren't impacted. I think that's short sighted though. Hiding complexity behind utility functions will be painful when debugging tough problems. And updating existing code to use pattern matchers is a lot of churn -- probably more churn than making the constructs first-class citizens.

IMHO, we'd be better off baking these new features into LLVM right from the start. These 3 topics are fairly significant features. It would be hard to argue that any one will go out of style in the foreseeable future...

In D57504#1724217, @cameron.mcinally wrote:

In D57504#1723586, @simoll wrote:

DevMtg Summary

There will be a separate RFC for the generalized pattern rewriting logic in LLVM-VP (see PatternMatch.h). We do this because it is useful for other efforts as well, eg to make the existing pattern rewrites in InstSimplify/Combine, DAGCombiner work also for constrained fp (@uweigand ) and complex arithmetic (@greened) . This may actually speedup things since we can pursue VP and generalized pattern match in parallel.

I'd like to rant a little bit to see if anyone agrees with my probably unpopular opinion...

Code explosion is the symptom, not the sickness. It's caused by using experimental intrinsics. Experimental intrinsics are a detriment to progress. They end up creating a ton more work and are designed to be inevitably replaced.

Actually, the idea behind the generalized pattern code is to offer a way to gradually transition from intrinsics to native instruction support without disturbing transformations. The pattern matcher is templatized to match the intrinsics first (through utility classes). When the transition to native IR support is complete, one template-instantiation of the pattern rewriter gets dropped and the code bloat is undone. Eg in the case of VP, eventually PatternMatch will only ever be instantiated for the PredicatedContext and no longer for the special case of the EmptyContext. However, initially (in this patch) the pattern matcher is still instantiated for both kinds of context.
We can use the same mechanism to lift existing optimizations to complex arithmetic intrinsics. In that case, the matcher context would require that all constituent operations are complex number operators. The builder consuming the context will emit complex operations.
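
A stripped-down sketch of the idea, with a toy IR instead of LLVM's classes: the matcher is a template over a context, and the context decides which operations may take part in a match. The names `EmptyContext` and `PredicatedContext` are taken from the description above; everything else (`Node`, `matchDoubleNeg`, the `maskId` encoding, and the conservative rule that all matched operations must share one predicate) is a hypothetical simplification for illustration.

```cpp
#include <optional>

// Toy IR node: a negation or a leaf. maskId == 0 means "not predicated";
// any other value names the mask (and vector length) guarding the node.
struct Node {
  enum Kind { Leaf, Neg } kind;
  const Node *operand;  // only used for Neg
  int maskId;
};

// Context for regular IR: only unpredicated operations may match.
struct EmptyContext {
  bool accept(const Node &n) { return n.maskId == 0; }
};

// Context for predicated IR: all matched operations must share the
// predicate of the first matched operation. Unpredicated nodes count as
// sharing the trivial predicate 0, so this context also covers the
// patterns EmptyContext accepts.
struct PredicatedContext {
  std::optional<int> seen;
  bool accept(const Node &n) {
    if (!seen) {
      seen = n.maskId;
      return true;
    }
    return *seen == n.maskId;
  }
};

// Matches Neg(Neg(x)) under the rules of the given context; returns x on
// success and nullptr otherwise.
template <typename Context>
const Node *matchDoubleNeg(const Node &root, Context ctx = Context{}) {
  if (root.kind != Node::Neg || !ctx.accept(root))
    return nullptr;
  const Node &inner = *root.operand;
  if (inner.kind != Node::Neg || !ctx.accept(inner))
    return nullptr;
  return inner.operand;
}
```

In this toy, `matchDoubleNeg<PredicatedContext>` accepts every pattern that `matchDoubleNeg<EmptyContext>` accepts, which mirrors the point that the `EmptyContext` instantiation can be dropped once the transition is complete.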

I do understand the desire for these intrinsics by some -- i.e. devs that don't care about these new features aren't impacted. I think that's short sighted though. Hiding complexity behind utility functions will be painful when debugging tough problems. And updating existing code to use pattern matchers is a lot of churn -- probably more churn than making the constructs first-class citizens.

Sure. You know it's tempting to just duplicate all OpCodes (Opcodes v2) and redesign them (all of them..) to support all of this from the start: a) masking (also for scalar ops), b) an active vector length, c) constrained fp.
If you want the existing transformations to work with OpCodes v2, you'd need exactly the same pattern generalizations, btw. In the end, whether it's native instructions or intrinsics does not matter that much.

IMHO, we'd be better off baking these new features into LLVM right from the start. These 3 topics are fairly significant features. It would be hard to argue that any one will go out of style in the foreseeable future...

I'd say LLVM is long past the starting line. If we just turn on predication on regular IR instructions, many existing instruction transformations will break. You'd need one monster commit that does the switch and fixes all these transformations at the same time.

In D57504#1724217, @cameron.mcinally wrote:

Code explosion is the symptom, not the sickness. It's caused by using experimental intrinsics. Experimental intrinsics are a detriment to progress. They end up creating a ton more work and are designed to be inevitably replaced.

I think this is a big hammer argument for a nuanced topic.

We have used experimental intrinsics for a large number of disparate concepts, from exception handling to fuzzy vector extensions, and then after the semantics was defined and accepted, we baked the concepts into IR.

This is a proven track, and predication is a very similar example to past experiences, I see no contradiction here.

IMHO, we'd be better off baking these new features into LLVM right from the start. These 3 topics are fairly significant features. It would be hard to argue that any one will go out of style in the foreseeable future...

The risk of getting it wrong and having to re-bake into IR is high. We've done that with exception handling before and it wasn't pretty.

Predication is already in native IR form, albeit complex and error prone. The nuances across targets are too many to have a simple implementation working for everyone, and having a concrete implementation of the idea in intrinsic form may help clear up the issues before we stick to anything.

It's quite possible, and I really hope, that only a few targets will actually implement them, and that will be enough, so intrinsics will be short lived. Meanwhile, previous IR patterns will still match, so nothing is lost.

Of course, as with any intrinsic, it's quite possible that it will "just work" and people will give up half-way through. But history has shown that more often than not, these group efforts finish with a reasonable implementation, better than what we had before.

cheers,
--renato

+1 to what Renato said, I like this direction!
FWIW: we are working on Arm's M-profile Vector Extension (MVE), another vector extension for which this is very useful.

In D57504#1724902, @rengolin wrote:

In D57504#1724217, @cameron.mcinally wrote:

Code explosion is the symptom, not the sickness. It's caused by using experimental intrinsics. Experimental intrinsics are a detriment to progress. They end up creating a ton more work and are designed to be inevitably replaced.

I think this is a big hammer argument for a nuanced topic.

That's fair. But we can talk specifics too. We already have a lot of this functional and optimized in Clang. E.g.:

#include <stdio.h>

#pragma STDC FENV_ACCESS ON

void foo(double a[], double b[]) {
  double res[8];
  for(int i = 0; i < 8; i++)
    if (b[i] != 0.0)
      res[i] = a[i] / b[i];

  printf("%f\n", res[0]);
}

vmovupd (%rsi), %zmm0           #  test.c:8:9
vxorpd  %xmm1, %xmm1, %xmm1     #  test.c:8:14
vcmpneqpd       %zmm1, %zmm0, %k1 #  test.c:8:14
vmovupd (%rdi), %zmm1 {%k1} {z} #  test.c:9:16
vdivpd  %zmm0, %zmm1, %zmm0 {%k1} #  test.c:9:21
vmovupd %zmm0, (%rsp) {%k1}     #  test.c:9:14
vmovsd  (%rsp), %xmm0           #  test.c:11:18

That said, there's a large amount of technical debt from carrying these changes locally. I'd like to get out of that debt. That's why I'd like to avoid the experimental intrinsics detour.

I will also note that we care about a limited number of targets. So to be fair, take that into consideration.

We have used experimental intrinsics for a large number of disparate concepts, from exception handling to fuzzy vector extensions, and then after the semantics was defined and accepted, we baked the concepts into IR.

This is a proven track, and predication is a very similar example to past experiences, I see no contradiction here.

That's a fair argument too. I wasn't monitoring those projects, so I don't know the specifics.

Predication, Complex, and FPEnv require a massive amount of intrinsics to work though. Pretty much duplicating every operator (and target specific intrinsic for FPEnv). And probably some others I've forgotten. That seems like an unreasonable amount of intrinsics to me. But if others with experience in experimental intrinsics think it's manageable, I can't really argue.

IMHO, we'd be better off baking these new features into LLVM right from the start. These 3 topics are fairly significant features. It would be hard to argue that any one will go out of style in the foreseeable future...

The risk of getting it wrong and having to re-bake into IR is high. We've done that with exception handling before and it wasn't pretty.

Predication is already in native IR form, albeit complex and error prone. The nuances across targets are too many to have a simple implementation working for everyone, and having a concrete implementation of the idea in intrinsic form may help clear up the issues before we stick to anything.

It's quite possible, and I really hope, that only a few targets will actually implement them, and that will be enough, so intrinsics will be short lived. Meanwhile, previous IR patterns will still match, so nothing is lost.

Of course, as with any intrinsic, it's quite possible that it will "just work" and people will give up half-way through. But history has shown that more often than not, these group efforts finish with a reasonable implementation, better than what we had before.

Another good argument. And this isn't really the hill I want to die on. But it just seems silly to me to implement something twice: Occam's razor. We'll have to work the kinks out somewhere -- so why not push directly to the goal...

In D57504#1725429, @cameron.mcinally wrote:

But it just seems silly to me to implement something twice: Occam's razor. We'll have to work the kinks out somewhere -- so why not push directly to the goal...

I see where you're coming from, but hindsight is 20/20. Implementing something twice, when the first one is a prototype means you can make a lot of mistakes on the first iteration.

If the cost of changing the IR outweighs the prototyping costs (it usually does), then the overall cost is lower, even if for a longer period.

The current proposals are interlinked, so I don't think there will be combinatorial explosion, or even multiplication of intrinsics. I hope that we'll figure out the best way to represent that into IR sooner because of that.

This is not the first time that we try to get those into IR proper, either. All previous times we started with "change the IR" approach and could never get into agreement.

Intrinsics give us the prototype route: low implementation cost, low impact, easy to clean up later. It does add clutter in between, but that impact can also be limited to one or two targets of the willing sub-communities.

LLVM is a very fast moving target, stopping the world to get the IR "right" doesn't work.

A good example to look for is the scalable vector IR changes that have gone through multiple attempts and are going on for many years and still not complete...

These things take time, rushing it usually backfires. :)

In D57504#1725445, @rengolin wrote:

LLVM is a very fast moving target, stopping the world to get the IR "right" doesn't work.

A good example to look for is the scalable vector IR changes that have gone through multiple attempts and are going on for many years and still not complete...

These things take time, rushing it usually backfires. :)

Ha, yeah. All good points. I'll let this drop...

Updates

Fixed several intrinsic attributes.
All fp intrinsics are constrained (identically to the llvm.experimental.constrained.* ones). They behave like regular fp ops if fpexcept.ignore is passed.
Bitcode verifier test.

Observations

When using fpexcept.ignore, the fp callsites should have the readnone attribute set on them to override the inaccessiblememonly of the intrinsic declarations. That way DCE still works.
The rules for constrained fp (strictfp on the function definition, only constrained fp in that function) apply only if there is a single fp op with exceptions in the function. That is, strictfp is not necessary when all fp ops have fpexcept.ignore.
When the exception behavior is not fpexcept.ignore, the fp op of the intrinsic is not revealed (getFunctionalOpcode(..) returns Call in that case).
(FIXME) NoCapture does not work on vectors of pointers.

Next steps

As mentioned earlier, generalized pattern matching will be part of a separate RFC (although it's still included in this reference implementation).
I'd like to discuss the actual intrinsic signatures next. For that I will upload a new minimal patch for integer intrinsic support.

Harbormaster completed remote builds in B40198: Diff 226913.Oct 29 2019, 9:39 AM

Hi Simon, I went through the code for the first time, and this is a first round of proper nitpicks from my side. Please ignore if you want to focus on the bigger picture at this point in the discussion, but these are just some things I noticed. General nitpick is that you should run clang-format as there are quite a few coding style issues: indentation, indentation of arguments, exceeding 80 columns, placement of * and & in arguments and return values, etc. And find some more nitpicks inlined.

llvm/include/llvm/CodeGen/ISDOpcodes.h
516	I was unfamiliar with this one... I think I know what it does, and how it is different from VP_SELECT, but for clarity, can you define what `integer pivot` is?
519	typo: hether
1198	just spell out 'otherwise' here, and also below.
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
711	Perhaps outdated comment? Should it be something along the lines of 'vector predicated node' instead of 'explicit vector length node'?
1479	indentation of `\|\|` off by 1?
2344	`VP_LOAD` and `VP_STORE`?
2373	same?
2413	`.. does a truncation before store` sounds a bit odd. Since 'truncating store' is a well-known term, and you explain what it means for ints/floats below, I think it suffices to say "Return true if this is a truncating store. For integers ..."

In D57504#1725554, @simoll wrote:

Observations

When using fpexcept.ignore, the fp callsites should have the readnone attribute set on them to override the inaccessiblememonly of the intrinsic declarations. That way DCE still works.

Wouldn't that allow the call to be moved relative to other calls? Specifically, we need to make sure intrinsics aren't moved relative to calls that change the rounding mode. The "inaccessiblememonly" attribute is meant to model both the reading of control modes and the possible setting of status flags or raising of exceptions.

The rules for constrained fp (strictfp on the function definition, only constrained fp in that function) apply only if there is a single fp op with exceptions in the function. That is strictfp is not necessary when all fp ops have fpexcept.ignore.

I don't think this is right. Even if there are no constrained FP operations in the function we might have math library calls for which the strictfp attribute is needed to prevent libcall simplification and constant folding that might violate the rounding mode.

In D57504#1726355, @andrew.w.kaylor wrote:

In D57504#1725554, @simoll wrote:

Observations

When using fpexcept.ignore, the fp callsites should have the readnone attribute set on them to override the inaccessiblememonly of the intrinsic declarations. That way DCE still works.

Wouldn't that allow the call to be moved relative to other calls? Specifically, we need to make sure intrinsics aren't moved relative to calls that change the rounding mode. The "inaccessiblememonly" attribute is meant to model both the reading of control modes and the possible setting of status flags or raising of exceptions.

The rules for constrained fp (strictfp on the function definition, only constrained fp in that function) apply only if there is a single fp op with exceptions in the function. That is strictfp is not necessary when all fp ops have fpexcept.ignore.

I don't think this is right. Even if there are no constrained FP operations in the function we might have math library calls for which the strictfp attribute is needed to prevent libcall simplification and constant folding that might violate the rounding mode.

I see. Since we need to model the default fp environment with these intrinsics (and this is our priority), let me make the following suggestion: VP intrinsics will have a rounding mode and exception behavior argument from the start, but the only allowed values are "round.tonearest" and "fpexcept.ignore". Once we have a solution for the general case implemented for constrained fp, we will unlock that feature also for LLVM VP.

In D57504#1726007, @SjoerdMeijer wrote:

Hi Simon, I went through the code for the first time, and this is a first round of proper nitpicks from my side. Please ignore if you want to focus on the bigger picture at this point in the discussion, but these are just some things I noticed. General nitpick is that you should run clang-format as there are quite a few coding style issues: indentation, indentation of arguments, exceeding 80 columns, placement of * and & in arguments and return values, etc. And find some more nitpicks inlined.

Hi Sjoerd, thanks for your comments! I've fixed the inline nitpicks right away. I'll do a style pass for the actual commits.

In D57504#1726715, @simoll wrote:

In D57504#1726007, @SjoerdMeijer wrote:

Hi Simon, I went through the code for the first time, and this is a first round of proper nitpicks from my side. Please ignore if you want to focus on the bigger picture at this point in the discussion, but these are just some things I noticed. General nitpick is that you should run clang-format as there are quite a few coding style issues: indentation, indentation of arguments, exceeding 80 columns, placement of * and & in arguments and return values, etc. And find some more nitpicks inlined.

Hi Sjoerd, thanks for your comments! I've fixed the inline nitpicks right away. I'll do a style pass for the actual commits.

Cheers. Just curious, what are your next steps? People can correct me if I'm wrong, but my impression is that with the RFC, this prototype, and the discussion at the US LLVM dev conference, there is consensus and people are on board with the general idea and direction. There are still some discussions on e.g. the (constrained) FP part, but would now be the time to split this up into separate commits, for example an INT and an FP part (if that makes sense), so that they can be progressed separately?

In D57504#1729839, @SjoerdMeijer wrote:

In D57504#1726715, @simoll wrote:

In D57504#1726007, @SjoerdMeijer wrote:

Hi Simon, I went through the code for the first time, and this is a first round of proper nitpicks from my side. Please ignore if you want to focus on the bigger picture at this point in the discussion, but these are just some things I noticed. General nitpick is that you should run clang-format as there are quite a few coding style issues: indentation, indentation of arguments, exceeding 80 columns, placement of * and & in arguments and return values, etc. And find some more nitpicks inlined.

Hi Sjoerd, thanks for your comments! I've fixed the inline nitpicks right away. I'll do a style pass for the actual commits.

Cheers. Just curious, what are your next steps? People can correct me if I'm wrong, but my impression is that with the RFC, this prototype, and the discussion at the US LLVM dev conference, there is consensus and people are on board with the general idea and direction. There are still some discussions on e.g. the (constrained) FP part, but would now be the time to split this up into separate commits, for example an INT and an FP part (if that makes sense), so that they can be progressed separately?

Yes, I hope that's where things are right now ;-) I am planning to go by functional slices. Each slice comes with IR-level intrinsics, TTI support, basic lowering to standard IR, Selection DAG support and tests.

I am preparing the first patchset for integer support atm.

Slices:

Integer slice.
Memory slice.
Reduction slice.
FP (with unconstrained metadata args) slice.

Standalone patch:

Mask, VectorLength and Passthru attributes (in preparation of vector function calls).

Pending discussion/separate RFC:

Constrained FP (being able to fully optimize constrained fp intrinsics in the default fp env).
Generalized pattern match (aka optimizing VP).

Nice one!

k-ishizaka added a subscriber: k-ishizaka.Nov 4 2019, 9:46 PM

D69552: Move floating point related entities to namespace level contains the fp enum changes required for LLVM-VP. Referencing the patch here.

Fixed attribute placements, signatures, more tests, ..
This is in sync with the subpatch #1 of the integer slice (https://reviews.llvm.org/D69891).

Harbormaster completed remote builds in B40575: Diff 228052.Nov 6 2019, 6:27 AM

Integer slice patches

#1 IR-level support: https://reviews.llvm.org/D69891
#2 TTI & Legalization: <stay tuned>
#3 ISel patch: <stay tuned>

I'll update this comment as we go to keep track of the integer slice.

simoll added a child revision: D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.Nov 7 2019, 12:39 AM

Changes

VPIntrinsics.def file.
Pass vlen i32 -1 to enable all lanes with scalable vector types.
Various NFC fixes.

Harbormaster completed remote builds in B40813: Diff 228885.Nov 12 2019, 6:53 AM

Moving the discussion from the integer patch alley to the main RFC as this is about the general design of VP intrinsics.. it's about having a passthru operand (as in llvm.masked.load) and whether %evl should be a parameter of the intrinsics or modelled differently.

@SjoerdMeijer https://reviews.llvm.org/D69891#inline-636845
and if I'm not mistaken we are now discussing if undef here should be undef or a passthru

@rkruppe https://reviews.llvm.org/D69891#inline-637215
I previously felt that passthru would be nice to have for backend maintainers (including myself) but perhaps not worth the duplication of IR functionality (having two ways to do selects). However, given the differences I just described, I don't think "just use select" is workable.

Ok. I do agree that having passthru simplifies isel for certain architectures (and legal combinations of passthru value, type, and operations..) but:
VP intrinsics aren't target intrinsics: they are not supposed to be a way to directly program any specific ISA in the way of a macroassembler, like you would do with llvm.x86.* or llvm.arm.mve.* or any other. Rather, think of them as regular IR instructions. Pretend that anything we propose in VP intrinsics will end up as a feature of first-class LLVM instructions. Based on that, I figured that one VP intrinsic should match one IR instruction plus predication, nothing more.

If we had predicated IR instructions, would we want them to have a passthru operand?
The prototype shows that defining VP intrinsics with undef-on-masked-out makes it straightforward to generalize InstSimplify/InstCombine/DAGCombiner such that they can optimize VP intrinsics. If you add a passthru operand, then logically VP intrinsics start to behave like two instructions: that could be made to work, but it would be messier as you'd have to peek through selects, etc.

@sdesmalen https://reviews.llvm.org/D69891#1750287
If we want to solve the select issue and also keep the intrinsics simple, my suggestion was to combine the explicit vector length with the mask using an explicit intrinsic like @llvm.vp.enable.lanes. Because this is an explicit intrinsic, the code-generator can simply extract the %evl parameter and pass that directly to the instructions for RVV/SXA. This is what happens for many other intrinsics in LLVM already, like masked.load/masked.gather that support only a single addressing mode, where it is up to the code-generator to pick apart the value into operands that are suited for a more optimal load instruction.

Without having heard your thoughts on this suggestion, I would have to guess that your reservation is the possibility of LLVM hoisting/separating the logic that merges predicate mask and %evl value in some way. That would mean having to do some tricks (think CodeGenPrep) to keep the values together and recognizable for CodeGen. And that's the exact same thing we would like to avoid for supporting merging/zeroing predication, hence the suggestion for the explicit passthru parameter.

That's not quite the same:
%evl is mapped to a hardware register on SX-Aurora. We cannot simply reconstitute the %evl from any given mask; if %evl is obscured, it makes all operations that depend on it less efficient because we need to default to the full vector length. Now, if the select is separated from the VP intrinsic, you simply emit one select instruction (and it should be possible to hoist it back and merge it with the VP intrinsic in most cases; you probably want an optimization that does that anyway because there will be code with explicit selects even with passthru). Besides, if the select is folded with an instruction that is subsequently simpler, then that's actually an argument in favor of explicit selects: passthru makes this implicit.

In D57504#1758546, @simoll wrote:

Ok. I do agree that having passthru simplifies isel for certain architectures (and legal combinations of passthru value, type, and operations..) but:
VP intrinsics aren't target intrinsics: they are not supposed to be a way to directly program any specific ISA in the way of a macroassembler, like you would do with llvm.x86.* or llvm.arm.mve.* or any other. Rather, think of them as regular IR instructions. Pretend that anything we propose in VP intrinsics will end up as a feature of first-class LLVM instructions. Based on that, I figured that one VP intrinsic should match one IR instruction plus predication, nothing more.

If we had predicated IR instructions, would we want them to have a passthru operand?

I think that would probably be a reasonable clean-slate IR design, though I am not at all sure if it would be better. I wasn't specifically advocating for passthru operands, though. I agree that not having passthru and performing the same function in two operations can be readily pattern-matched by backends at modest effort and failing to match it has low cost. My main point was just that the existing select instruction is not sufficient as the second operation, for essentially the same reason why the VP intrinsics have an EVL argument instead of just the mask. Creating a VP equivalent of select (as already sketched in the other thread) resolves that concern just as well.

simoll marked an inline comment as done.Dec 3 2019, 4:18 AM

simoll added inline comments.

llvm/docs/LangRef.rst
15902–15919	@rkruppe [..] My main point was just that the existing select instruction is not sufficient as the second operation, for essentially the same reason why the VP intrinsics have an EVL argument instead of just the mask. Creating a VP equivalent of select (as already sketched in the other thread) resolves that concern just as well. I agree. The prototype has defined such an `llvm.vp.select` from the get-go.

rkruppe added inline comments.Dec 3 2019, 10:06 AM

llvm/docs/LangRef.rst
15902–15919	Oops, missed that / forgot about it. Sorry for the noise. Is there a reason why it's not in the "integer slice" patch? It's not integer-specific, but it seems to fit even less into the other slices.

simoll marked an inline comment as done.Dec 9 2019, 12:52 AM

simoll added inline comments.

llvm/docs/LangRef.rst
15902–15919	I wanted to keep the integer patch concise, for one. Also, having played around with this for a while now, I think that the signature of `vp.select` should be: llvm.vp.select(<W x i1> %m, %onTrue, %onFalse, i32 %threshold, i32 vlen %evl), meaning that values from %onTrue are selected where %m is true and the lane index is below %threshold. %onFalse is selected otherwise. Lanes with indices greater than or equal to %evl are undef as ever. In short: there is just one "merge" operation and no more separate `vp.compose`.
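The lane rule described for the proposed `vp.select` can be modeled on plain arrays. What follows is only a minimal sketch, not LLVM code: `vpSelect` and the `Lane` alias are hypothetical names, `std::nullopt` stands in for an undef lane, and the negative-%evl case is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of one i32 lane; std::nullopt stands in for undef.
using Lane = std::optional<int>;

// Models the proposed llvm.vp.select(%m, %onTrue, %onFalse, %threshold, %evl):
//  - lanes with index >= evl are undef,
//  - otherwise %onTrue is picked where the mask bit is set and the lane index
//    is below threshold, and %onFalse is picked in all remaining lanes.
std::vector<Lane> vpSelect(const std::vector<bool> &m,
                           const std::vector<Lane> &onTrue,
                           const std::vector<Lane> &onFalse,
                           std::size_t threshold, std::size_t evl) {
  assert(m.size() == onTrue.size() && m.size() == onFalse.size());
  assert(evl <= m.size() && "evl must not exceed the number of lanes");
  std::vector<Lane> result(m.size(), std::nullopt);
  for (std::size_t i = 0; i < evl; ++i)
    result[i] = (m[i] && i < threshold) ? onTrue[i] : onFalse[i];
  return result;
}
```

With the threshold set to the number of lanes this behaves like a plain masked select, and with an all-true mask it behaves like the former `vp.compose`, which is presumably why one operation suffices.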

Herald added a subscriber: luismarques. Dec 9 2019, 12:52 AM

Matt added a subscriber: Matt.Dec 18 2019, 12:57 PM

kariddi added a subscriber: kariddi.Dec 26 2019, 11:44 AM

Add pivot/threshold argument to llvm.vp.select and remove llvm.vp.compose
Clarify documentation on preserved lanes
The explicit vlen arg is either negative or (new requirement) less than or equal to the number of lanes of the operation.
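To make the vlen rule concrete, here is a minimal sketch of how the set of active lanes could be derived from the mask and the explicit vector length. It is an illustration on plain arrays, not LLVM code; the helper name `activeLanes` is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: computes which lanes of a W-lane VP operation are
// active. A negative evl means "no length restriction" (all lanes allowed by
// the mask are active); a non-negative evl must be <= W.
std::vector<bool> activeLanes(const std::vector<bool> &mask, std::int32_t evl) {
  assert((evl < 0 || static_cast<std::size_t>(evl) <= mask.size()) &&
         "non-negative evl must be <= W");
  std::vector<bool> active(mask.size(), false);
  for (std::size_t i = 0; i < mask.size(); ++i)
    active[i] = mask[i] && (evl < 0 || i < static_cast<std::size_t>(evl));
  return active;
}
```

Passing `-1` here corresponds to the "enable all lanes" convention mentioned for scalable vector types, where the lane count is not known at compile time.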

(not sure if I should continue here or in D69891, will try here first)

Sorry for dipping out of this discussion. I.e. after our "passthru discussion", I wanted to do more homework to make sure a "separate select" would work for us, if we wouldn't miss anything, but then other work happened and never got round to this. But I am still very interested, so dipping back in :-/

If we had predicated IR instructions, would we want them to have a passthru operand?

I think that would probably be a reasonable clean-slate IR design, though I am not at all sure if it would be better.

One of the problems I had was that I found it difficult to see all the consequences and to answer the sort of questions asked above (also because I haven't yet spent enough time on this). For example, is being explicit in IR a good thing to do in general? So yes, why not a passthru? But then I had the same question as Robin: not sure it would be better. The other thing I would mention again is that I think convenience is a pretty strong argument too: if this is most convenient for at least two or three other architectures, then why not? But then you could argue that it is simple to patch up with a select, and we're going in circles... At least the concern Robin brought up about the select seems to be addressed with the vp.select.

In D57504#1849694, @SjoerdMeijer wrote:

(not sure if I should continue here or in D69891, will try here first)

Yep, the RFC is the right place for conceptual discussions.

Sorry for dipping out of this discussion. I.e. after our "passthru discussion", I wanted to do more homework to make sure a "separate select" would work for us, if we wouldn't miss anything, but then other work happened and never got round to this. But I am still very interested, so dipping back in :-/

Welcome back :)

If we had predicated IR instructions, would we want them to have a passthru operand?

I think that would probably be a reasonable clean-slate IR design, though I am not at all sure if it would be better.

One of the problems I had was that I found it difficult to see all the consequences and to answer the sort of questions asked above (also because I haven't yet spent enough time on this). For example, is being explicit in IR a good thing to do in general? So yes, why not a passthru? But then I had the same question as Robin: not sure it would be better. The other thing I would mention again is that I think convenience is a pretty strong argument too: if this is most convenient for at least two or three other architectures, then why not? But then you could argue that it is simple to patch up with a select, and we're going in circles... At least the concern Robin brought up about the select seems to be addressed with the vp.select.

Couldn't agree more. I guess we just do not know at this point.. how about we move the discussion away from "which would be better?" to "if we decide for A now and later strongly realize that B would have been the right call.. how bad a u-turn would that be?"

Changes required going from passthru to select:

IR: modernize VP with passthru to intrinsic+select
Nothing more.. since we already had to implement the select+intrinsic matching logic anyway to fuse explicit selects into passthru operands.
Dead code: all the logic for dealing with the passthru operand: PatternMatch for passthru (instcombine, instsimplify, known bits..), etc

Changes required going from select to passthru:

IR: modernize and pass 'undef' as passthru
Implement that pass from the other scenario that folds select into passthru (and all the additional logic for dealing with passthru).
Dead code: none

My point here is that no matter how we decide: explicit selects and vp intrinsics will co-exist and have to be folded/optimized. However, in the explicit-select scenario we do not have to teach LLVM about passthru operands (PatternMatch -> InstCombine, ...).
Btw, I guess that https://reviews.llvm.org/D71432 shows that op+select folding can be cleanly implemented in isel and that's also in line with my experiments for the VE target.
Regarding convenience: the IRBuilder could have, eg, a ::CreatePredicatedFAdd with an explicit (optional) passthru operand.. resulting in a VP op + select.
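The convenience idea can be illustrated with a toy model. This is only a sketch on plain arrays and does not use the real IRBuilder API: `vpFAdd` and `predicatedFAdd` are hypothetical names, and `std::nullopt` stands in for an undef lane.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model, not LLVM code: std::nullopt stands in for an undef lane.
using Lane = std::optional<float>;

// Stand-in for a VP fadd with undef-on-masked-out semantics.
std::vector<Lane> vpFAdd(const std::vector<bool> &mask,
                         const std::vector<float> &a,
                         const std::vector<float> &b) {
  assert(mask.size() == a.size() && a.size() == b.size());
  std::vector<Lane> result(a.size(), std::nullopt);
  for (std::size_t i = 0; i < a.size(); ++i)
    if (mask[i])
      result[i] = a[i] + b[i];
  return result;
}

// Hypothetical convenience wrapper: without a passthru it returns the bare VP
// op; with a passthru it appends a separate select that fills the masked-off
// lanes, so the result is "VP op + select" rather than one fused operation.
std::vector<Lane> predicatedFAdd(
    const std::vector<bool> &mask, const std::vector<float> &a,
    const std::vector<float> &b,
    const std::optional<std::vector<float>> &passthru = std::nullopt) {
  std::vector<Lane> result = vpFAdd(mask, a, b);
  if (passthru) {
    assert(passthru->size() == result.size());
    for (std::size_t i = 0; i < result.size(); ++i)
      if (!mask[i])
        result[i] = (*passthru)[i];
  }
  return result;
}
```

The point of the split is that optimizations only ever need to understand the two-operand VP op; the passthru exists purely at the builder level.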

Couldn't agree more. I guess we just do not know at this point.. how about we move the discussion away from "which would be better?" to "if we decide for A now and later strongly realize that B would have been the right call.. how bad a u-turn would that be?"

Changes required going from passthru to select:

IR: modernize VP with passthru to intrinsic+select

Nothing more.. since we already had to implement the select+intrinsic matching logic anyway to fuse explicit selects into passthru operands.

Dead code: all the logic for dealing with the passthru operand: PatternMatch for passthru (instcombine, instsimplify, known bits..), etc

Changes required going from select to passthru:

IR: modernize and pass 'undef' as passthru

Implement that pass from the other scenario that folds select into passthru (and all the additional logic for dealing with passthru).

Dead code: none

My point here is that no matter how we decide: explicit selects and vp intrinsics will co-exist and have to be folded/optimized. However, in the explicit-select scenario we do not have to teach LLVM about passthru operands (PatternMatch -> InstCombine, ...).
Btw, I guess that https://reviews.llvm.org/D71432 shows that op+select folding can be cleanly implemented in isel and that's also in line with my experiments for the VE target.
Regarding convenience: the IRBuilder could have, eg, a ::CreatePredicatedFAdd with an explicit (optional) passthru operand.. resulting in a VP op + select.

Thanks for summarising this. Fair enough, I think this sounds like a (good) plan.
I will continue in D69891, and will leave a comment there.

Btw, I guess that https://reviews.llvm.org/D71432 shows that op+select folding can be cleanly implemented in isel and that's also in line with my experiments for the VE target.

This needs a caveat. Keeping the select glued to the operation takes some careful effort. Especially in the undef passthru case, there are a bunch of peeps that will incorrectly fold away the select. E.g. this transform from InstSimplify:

if (isa<UndefValue>(FalseVal))   // select ?, X, undef -> X
  return TrueVal;

The VP intrinsics will certainly be immune to these, but if the plan is to eventually replace the VP select intrinsics with IR selects, then this problem will need to be solved. Just a heads up...

In D57504#1851864, @cameron.mcinally wrote:
Btw, I guess that https://reviews.llvm.org/D71432 shows that op+select folding can be cleanly implemented in isel and that's also in line with my experiments for the VE target.

This needs a caveat. Keeping the select glued to the operation takes some careful effort. Especially in the undef passthru case, there are a bunch of peeps that will incorrectly fold away the select. E.g. this transform from InstSimplify:
if (isa<UndefValue>(FalseVal))   // select ?, X, undef -> X
  return TrueVal;
The VP intrinsics will certainly be immune to these, but if the plan is to eventually replace the VP select intrinsics with IR selects, then this problem will need to be solved. Just a heads up...

As Eli argued in that patch, IR like select %m, (constrained.fadd %a, %b), %passthru is not expressing a predicated vector add, and must not be selected as such. The IR semantics are unambiguously: first a full vector add is performed (with all exceptions etc. that entails, or possible UB in related cases like integer division) and then some of the resulting lanes are replaced with values from %passthru. To predicate the fadd itself, a dedicated operation/intrinsic is needed. LLVM IR does not currently (and should not) change the meaning of the regular unpredicated operations based on (some? any?) uses of the value being a select. The only thing a select (or vp.select) can do is alter the lanes of a vector after it has been computed, it cannot travel back in time to change how it was computed.

VP intrinsics are the aforementioned predicated operations: in certain lanes, no computation (which might raise FP exceptions, have UB, etc.) happens and the resulting vector has some "default value" instead. The present discussion about whether to include a %passthru argument is just about how this default value is determined. But this does not change that the operation itself is predicated, it just affects how you express e.g. the patterns that map to SVE's zeroing and merging predication.
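The distinction between "operate, then select" and a truly predicated operation can be seen in a small model. This is a sketch under stated assumptions, not LLVM code: integer division stands in for an operation with a per-lane side effect, a thrown exception stands in for what would be UB in real IR, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

using Lane = std::optional<int>; // std::nullopt stands in for an undef lane

// Per-lane operation with an observable side effect: dividing by zero is
// reported by throwing (it would be UB in real IR).
int divLane(int a, int b) {
  if (b == 0)
    throw std::domain_error("division by zero");
  return a / b;
}

// Unpredicated op followed by select: every lane is computed first, then the
// masked-off lanes are replaced by the passthru values.
std::vector<int> divThenSelect(const std::vector<bool> &m,
                               const std::vector<int> &a,
                               const std::vector<int> &b,
                               const std::vector<int> &passthru) {
  assert(m.size() == a.size() && a.size() == b.size() &&
         a.size() == passthru.size());
  std::vector<int> full(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    full[i] = divLane(a[i], b[i]); // runs even where m[i] is false
  for (std::size_t i = 0; i < a.size(); ++i)
    if (!m[i])
      full[i] = passthru[i];
  return full;
}

// Predicated op: masked-off lanes are never computed and are left undef.
std::vector<Lane> vpDiv(const std::vector<bool> &m, const std::vector<int> &a,
                        const std::vector<int> &b) {
  assert(m.size() == a.size() && a.size() == b.size());
  std::vector<Lane> result(a.size(), std::nullopt);
  for (std::size_t i = 0; i < a.size(); ++i)
    if (m[i])
      result[i] = divLane(a[i], b[i]);
  return result;
}
```

With a zero divisor in a masked-off lane, `divThenSelect` still hits the division, while `vpDiv` never evaluates that lane. A later select can only change the values, not undo the side effect.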

In D57504#1851960, @rkruppe wrote:

As Eli argued in that patch, IR like select %m, (constrained.fadd %a, %b), %passthru is not expressing a predicated vector add, and must not be selected as such. The IR semantics are unambiguously: first a full vector add is performed (with all exceptions etc. that entails, or possible UB in related cases like integer division) and then some of the resulting lanes are replaced with values from %passthru. To predicate the fadd itself, a dedicated operation/intrinsic is needed. LLVM IR does not currently (and should not) change the meaning of the regular unpredicated operations based on (some? any?) uses of the value being a select. The only thing a select (or vp.select) can do is alter the lanes of a vector after it has been computed, it cannot travel back in time to change how it was computed.

VP intrinsics are the aforementioned predicated operations: in certain lanes, no computation (which might raise FP exceptions, have UB, etc.) happens and the resulting vector has some "default value" instead. The present discussion about whether to include a %passthru argument is just about how this default value is determined. But this does not change that the operation itself is predicated, it just affects how you express e.g. the patterns that map to SVE's zeroing and merging predication.

Understood. I now see that we already discussed this here in October.

Your current argument sounds like it argues for explicit passthrus. E.g.:

select %m, (vp.fadd %m, %a, %b), %zeroinitializer

On SVE, this would become something like:

movprfx z0.s, p0/z, z0.s
fadd z0.s, p0/m, z0.s, z1.s

Isn't that traveling back in time to change how the inactive elements are defined? To be true to the IR, we'd want something like:

fadd z0.s, p0/m, z0.s, z1.s
sel z0.s, p0/m, z0.s, <zero_vector>

How do we justify that this case is different than the op+select->predicated_op case? Are we assuming the implicit undef on the VP intrinsic allows for it?

I'm not sure what problem you think there might be? Both code sequences do the same thing (same side effects, same final result) as the input IR they matched, right? So that's what justifies them both as valid outputs and the choice is just a matter of codegen quality. You don't even need to appeal to the vp.fadd producing undef in disabled lanes, because in the final result those lanes are zero anyway and that's all that matters. This doesn't seem fundamentally more tricky than any other isel pattern that matches multiple IR instructions to produce a more efficient combined instruction. For example, if the ARM backend selects add i32 %a, (shl i32 %b, 4) as add r0, r0, r1, lsl #4, it never materializes shl %b, 4 (not into a register, at least) but the end result is still correct.

In D57504#1852185, @rkruppe wrote:

I'm not sure what problem you think there might be? Both code sequences do the same thing (same side effects, same final result) as the input IR they matched, right?

Ah, right. The side effects are the difference. Thanks for reminding me.

So that's what justifies them both as valid outputs and the choice is just a matter of codegen quality. You don't even need to appeal to the vp.fadd producing undef in disabled lanes, because in the final result those lanes are zero anyway and that's all that matters. This doesn't seem fundamentally more tricky than any other isel pattern that matches multiple IR instructions to produce a more efficient combined instruction. For example, if the ARM backend selects add i32 %a, (shl i32 %b, 4) as add r0, r0, r1, lsl #4, it never materializes shl %b, 4 (not into a register, at least) but the end result is still correct.

Yeah, this was what I was hung up on. I didn't see the difference between something like not materializing a dead instruction and masking an inactive element. But, yeah, the side effects would not be the same.

In D57504#1851864, @cameron.mcinally wrote:
Btw, I guess that https://reviews.llvm.org/D71432 shows that op+select folding can be cleanly implemented in isel and that's also in line with my experiments for the VE target.

This needs a caveat. Keeping the select glued to the operation takes some careful effort. Especially in the undef passthru case, there are a bunch of peeps that will incorrectly fold away the select. E.g. this transform from InstSimplify:
if (isa<UndefValue>(FalseVal))   // select ?, X, undef -> X
  return TrueVal;
The VP intrinsics will certainly be immune to these, but if the plan is to eventually replace the VP select intrinsics with IR selects, then this problem will need to be solved. Just a heads up...

@hsaito and I had a discussion about this earlier today. I had the same concern, that optimizations after the vectorizer might do something to decouple the vp.select from the vp.{operation}, which could lead to the code generator not being able to create a masked operation with passthru on targets that support that and thus potentially invalidate the cost model assumptions that the vectorizer made when it generated the predicated operation. Hideki convinced me that the additional freedom from explicit dependencies gained by not having a passthru argument as part of the predicated operation was likely to be more beneficial than tight coupling. If we ever do find this to be a problem, we can do something to make the intervening optimizations less aggressive with this sort of pattern.

I also talked briefly with @craig.topper about the X86 codegen handling of this, and his off the cuff reaction was to think that we probably won't have any problem generating the desired passthru+masked instructions from separated vp.select operations.

andrew.w.kaylor added inline comments.Jan 31 2020, 2:06 PM

llvm/docs/Proposals/VectorPredication.rst
2 ↗	(On Diff #228885)	Is there any reason that some form of this document can't be committed now? We have at least enough support to claim this as a community wide proposal, right?

(This was gonna be an inline comment on D69891, but it's more of a general conceptual issue, so I decided to move it here.)

Right now, LangRef changes in D69891 describe the restriction on the EVL value as this:

The explicit vector length (%evl) is only effective if it is non-negative, and when that is the case, its value is in the range:
0 <= %evl <= W,   where W is the vector length.

The restriction is good, but this wording doesn't specify what happens when %evl is not in that range. Some sort of undefined behavior, I assume, but this must be explicitly stated, especially since there are many ways in which it could be undefined. I don't recall previous discussion of this detail and I don't know what you have in mind, but some possibilities I see:

1.) The instruction has capital-UB undefined behavior. This gives the greatest flexibility to backends (e.g., allows generation of code that traps if %evl is too large) but I don't know of any architecture that needs this much flexibility and it constrains IR optimizations (code hoisting etc.) the most.
2.) The instruction returns poison (i.e., all result lanes are poison) and all lanes are (potentially, non-deterministically) enabled regardless of the mask parameter. This is less restrictive for IR optimizations (e.g., integer vp.add can unconditionally be speculated) but still allows backends to unconditionally use SETVL-style "stripmining" instructions that are not generally consistent (across architectures) w.r.t. which lanes become active when a vector length greater than the hardware vector length is requested.
3.) %EVLmask is undef, that's all. As a consequence, lanes disabled by the %mask argument definitely stay disabled, but for other lanes (where the mask has a 1 or an undef) it's non-deterministic whether they are active. As far as I can see, this has pretty much the same implications for IR optimizations and backends (excluding hypothetical pathological architectures) but is less of a special case to specify and directly captures the diversity of hardware behavior that (presumably) motivates this restriction on EVL.

Off the cuff, I would suggest the last option.
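To make the in-range rule concrete, here is a minimal Python sketch of how the mask and %evl might combine for a lane-wise add. The names `effective_mask` and `vp_add` are hypothetical and not part of any LLVM API, `None` merely stands in for a result lane that is not defined, and the `%evl > W` case is deliberately rejected because the quoted wording does not yet say what should happen there.

```python
from typing import List, Optional


def effective_mask(mask: List[bool], evl: int) -> List[bool]:
    """Return the lanes that are active for a given mask and %evl."""
    w = len(mask)  # W, the vector length
    if evl < 0:
        # Quoted wording: %evl is only effective if it is non-negative.
        return list(mask)
    if evl > w:
        # The open question in this thread; the sketch refuses to guess.
        raise ValueError("evl > W: behavior not specified")
    # Lane i is active iff the mask bit is set and i < evl.
    return [m and i < evl for i, m in enumerate(mask)]


def vp_add(x: List[int], y: List[int], mask: List[bool], evl: int) -> List[Optional[int]]:
    """Lane-wise add; inactive lanes are modeled as None."""
    active = effective_mask(mask, evl)
    return [a + b if on else None for a, b, on in zip(x, y, active)]
```

The three options listed above differ only in what should replace the `ValueError` branch.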

In D57504#1853591, @rkruppe wrote:
(This was gonna be an inline comment on D69891, but it's more of a general conceptual issue, so I decided to move it here.)

Right now, LangRef changes in D69891 describe the restriction on the EVL value as this:
The explicit vector length (%evl) is only effective if it is non-negative, and when that is the case, its value is in the range:
0 <= %evl <= W,   where W is the vector length.
The restriction is good, but this wording doesn't specify what happens when %evl is not in that range. Some sort of undefined behavior, I assume, but this must be explicitly stated, especially since there are many ways in which it could be undefined. I don't recall previous discussion of this detail and I don't know what you have in mind, but some possibilities I see:

The instruction has capital-UB undefined behavior. This gives the greatest flexibility to backends (e.g., allows generation of code that traps if %evl is too large) but I don't know of any architecture that needs this much flexibility and it constrains IR optimizations (code hoisting etc.) the most.

The instruction returns poison (i.e., all result lanes are poison) and all lanes are (potentially, non-deterministically) enabled regardless of the mask parameter. This is less restrictive for IR optimizations (e.g., integer vp.add can unconditionally be speculated) but still allows backends to unconditionally use SETVL-style "stripmining" instructions that are not generally consistent (across architectures) w.r.t. which lanes become active when a vector length greater than the hardware vector length is requested.

%EVLmask is undef, that's all. As a consequence, lanes disabled by the %mask argument definitely stay disabled, but for other lanes (where the mask has a 1 or an undef) it's non-deterministic whether they are active. As far as I can see, this has pretty much the same implications for IR optimizations and backends (excluding hypothetical pathological architectures) but is less of a special case to specify and directly captures the diversity of hardware behavior that (presumably) motivates this restriction on EVL.

Off the cuff, I would suggest the last option.

We (Libre-SoC, provisionally renamed from Libre-RISCV) are currently building a processor that supports variable-length vector operations by having each operation specify the starting register in a flat register file, then relying on VL telling it how many elements to operate on, which, when divided by the number of elements per register, directly translates to the number of registers to operate on. So, if VL is out of bounds, the instructions can overwrite registers past the end of the range assigned by the register allocator and/or trap. This would probably force use of option #1 above, at least for our processor. Our ISA design is still incomplete, so we might add (or already have) a mechanism allowing use of option #2 or #3 if there is a sufficient reason (will have to see what the rest of Libre-SoC think).
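As a rough illustration of the register arithmetic described above, the following Python sketch computes which registers an operation would touch for a given VL. The function names, the register numbering, and the idea of an "allocated" count are assumptions made for the example; they are not taken from the Libre-SoC ISA.

```python
def registers_touched(start_reg: int, vl: int, elems_per_reg: int) -> range:
    """Registers covered by an op that starts at start_reg and processes vl elements."""
    n_regs = -(-vl // elems_per_reg)  # ceiling division
    return range(start_reg, start_reg + n_regs)


def overruns_allocation(start_reg: int, vl: int, elems_per_reg: int, allocated_regs: int) -> bool:
    """True if the op would write past the registers the allocator reserved."""
    touched = registers_touched(start_reg, vl, elems_per_reg)
    return touched.stop > start_reg + allocated_regs
```

With two elements per register and three registers reserved, a VL of 6 stays in bounds while a VL of 7 spills into a fourth register, which is the overwrite hazard mentioned above.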

In D57504#1853671, @programmerjake wrote:

We (Libre-SoC, provisionally renamed from Libre-RISCV) are currently building a processor that supports variable-length vector operations by having each operation specify the starting register in a flat register file, then relying on VL telling it how many elements to operate on, which, when divided by the number of elements per register, directly translates to the number of registers to operate on. So, if VL is out of bounds, the instructions can overwrite registers past the end of the range assigned by the register allocator and/or trap. This would probably force use of option #1 above, at least for our processor. Our ISA design is still incomplete, so we might add (or already have) a mechanism allowing use of option #2 or #3 if there is a sufficient reason (will have to see what the rest of Libre-SoC think).

Presumably you have an efficient way to somehow force the VL into the intended range to support strip-mining of loops? The exact strategy doesn't matter, anything that avoids VL being "out of bounds" should make the other options work just fine. (Assuming there aren't other, larger problems with mapping VP operations to your ISA.)

In D57504#1853802, @rkruppe wrote:

In D57504#1853671, @programmerjake wrote:

We (Libre-SoC, provisionally renamed from Libre-RISCV) are currently building a processor that supports variable-length vector operations by having each operation specify the starting register in a flat register file, then relying on VL telling it how many elements to operate on, which, when divided by the number of elements per register, directly translates to the number of registers to operate on. So, if VL is out of bounds, the instructions can overwrite registers past the end of the range assigned by the register allocator and/or trap. This would probably force use of option #1 above, at least for our processor. Our ISA design is still incomplete, so we might add (or already have) a mechanism allowing use of option #2 or #3 if there is a sufficient reason (will have to see what the rest of Libre-SoC think).

Presumably you have an efficient way to somehow force the VL into the intended range to support strip-mining of loops? The exact strategy doesn't matter, anything that avoids VL being "out of bounds" should make the other options work just fine. (Assuming there aren't other, larger problems with mapping VP operations to your ISA.)

Yes, we do (setvl has an immediate for max VL, which needs to be calculated by the register allocator or similar), though it can be bypassed by writing directly to the VL register.

So, in that case, we should be able to use option #2 or #3, as long as the compiler doesn't write to VL by any means other than setvl.
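A minimal sketch of why a clamping setvl keeps VL in range, assuming the instruction simply returns min(requested, MVL). The Python names `setvl` and `strip_mined_add` are hypothetical, and real hardware may pick a different in-range value than the plain minimum.

```python
from typing import List, Tuple


def setvl(requested: int, mvl: int) -> int:
    """Model of a clamping setvl: the granted VL never exceeds MVL."""
    return max(0, min(requested, mvl))


def strip_mined_add(x: List[int], y: List[int], mvl: int) -> Tuple[List[int], List[int]]:
    """Add two arrays in chunks of at most mvl elements; also return the VLs used."""
    if mvl < 1:
        raise ValueError("mvl must be at least 1")
    out: List[int] = []
    vls: List[int] = []
    i = 0
    while i < len(x):
        vl = setvl(len(x) - i, mvl)  # request everything left, get at most mvl
        vls.append(vl)
        out.extend(a + b for a, b in zip(x[i:i + vl], y[i:i + vl]))
        i += vl
    return out, vls
```

Because every VL comes out of `setvl`, no iteration ever runs with VL above MVL, which is the property options 2 and 3 rely on.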

In D57504#1853591, @rkruppe wrote:
(This was gonna be an inline comment on D69891, but it's more of a general conceptual issue, so I decided to move it here.)

Right now, LangRef changes in D69891 describe the restriction on the EVL value as this:
The explicit vector length (%evl) is only effective if it is non-negative, and when that is the case, its value is in the range:
0 <= %evl <= W,   where W is the vector length.
The restriction is good, but this wording doesn't specify what happens when %evl is not in that range. Some sort of undefined behavior, I assume, but this must be explicitly stated, especially since there are many ways in which it could be undefined. I don't recall previous discussion of this detail and I don't know what you have in mind, but some possibilities I see:

The instruction has capital-UB undefined behavior. This gives the greatest flexibility to backends (e.g., allows generation of code that traps if %evl is too large) but I don't know of any architecture that needs this much flexibility and it constrains IR optimizations (code hoisting etc.) the most.

Exactly. The VE target strictly requires VL <= MVL or you'll get a hardware exception. Enforcing strict UB here means VP-users have to explicitly drop instructions that keep the VL within bounds. This means that we can optimize the VL computation code and that it can be factored into cost calculations, etc. With Options 2 & 3 this would happen only very late in the backend when most scalar optimizations are already done.
Besides, this still allows you to speculate as long as MVL (as in the UB-causing bound for VL) does not go below VL... could you explain under which circumstance MVL would go below VL by hoisting? This is definitely not the case for static VL targets (x86) and also not for VE.

TODO:

Define behavior for %evl > W
Amend that W is target specific.

llvm/docs/Proposals/VectorPredication.rst
2 ↗	(On Diff #228885)	I think so. I'll put the proposal doc up for review.

simoll mentioned this in D73889: [Doc] Proposal for vector predication.Feb 3 2020, 6:28 AM

simoll added a child revision: D73889: [Doc] Proposal for vector predication.Feb 3 2020, 6:36 AM

In D57504#1854330, @simoll wrote:

Exactly. The VE target strictly requires VL <= MVL or you'll get a hardware exception. Enforcing strict UB here means VP-users have to explicitly drop instructions that keep the VL within bounds. This means that we can optimize the VL computation code and that it can be factored into cost calculations, etc. With Options 2 & 3 this would happen only very late in the backend when most scalar optimizations are already done.

I think I'm lost here. Which thing is VL and which is MVL in this scenario?

Also, the talk about how various hardware treats the relative values of VL and MVL concerns me if either of these is supposed to be the width of the vector passed to this intrinsic. My understanding is that we're supposed to be able to generate vectors of any width we want in IR and the type legalization is responsible for mapping that to vector sizes that are legal for the target. So what does the target requirement mean here?

In D57504#1856207, @andrew.w.kaylor wrote:

In D57504#1854330, @simoll wrote:

Exactly. The VE target strictly requires VL <= MVL or you'll get a hardware exception. Enforcing strict UB here means VP-users have to explicitly drop instructions that keep the VL within bounds. This means that we can optimize the VL computation code and that it can be factored into cost calculations, etc. With Options 2 & 3 this would happen only very late in the backend when most scalar optimizations are already done.

I think I'm lost here. Which thing is VL and which is MVL in this scenario?

VL == %evl
MVL == W
Sorry for the vector speak :)

Also, the talk about how various hardware treats the relative values of VL and MVL concerns me if either of these is supposed to be the width of the vector passed to this intrinsic. My understanding is that we're supposed to be able to generate vectors of any width we want in IR and the type legalization is responsible for mapping that to vector sizes that are legal for the target. So what does the target requirement mean here?

I agree that, in the end, the semantics will be based solely on IR-types. However, what that semantics should look like for the %evl > W case depends on the way targets can handle this to make sure that whatever we specify on IR-level is at least reasonable for all targets.

From what I recall, the plan is to implement this by using fixed-size vector types combined with VL-based ops. MVL would be the size of those vector types.

Quoting all of lkcl's email so it ends up in Phabricator:

On Tue, Feb 4, 2020 at 3:48 AM @lkcl wrote:

In D57504#1856586, @simoll wrote:

In D57504#1856207, @andrew.w.kaylor wrote:

In D57504#1854330, @simoll wrote:

Exactly. The VE target strictly requires VL <= MVL or you'll get a
hardware exception. Enforcing strict UB here means VP-users have to
explicitly drop instructions that keep the VL within bounds. This means
that we can optimize the VL computation code and that it can be factored
into cost calculations, etc. With Options 2 & 3 this would happen only
very late in the backend when most scalar optimizations are already
done.

I think I'm lost here. Which thing is VL and which is MVL in this
scenario?

VL == %evl
MVL == W
Sorry for the vector speak :)

ah. right. that bit of information was important, simon :) without
clarification, i assumed W was the "required vector length at the
program loop level", whoops..

I agree that, in the end, the semantics will be based solely on IR-types.
However, what that semantics should look like for the %evl > W case
depends on the way targets can handle this to make sure that whatever we
specify on IR-level is at least reasonable for all targets.

okaaay, riight, so the purpose of the discussion is, e.g., to work out
how to represent things like for-loops in the strcpy example here, is
that right?

https://www.sigarch.org/simd-instructions-considered-harmful/

so %evl > W (i.e. %evl > MVL) in RVV, it is the very effort of trying
to *set* %evl to the loop length, this is retried *in every loop*.
and the implementation (in hardware) very very specifically -
unbeknownst to the programmer (and to the IR writer) - hard-limits
%evl *to* MVL.

to be clear: although the programmer *tries* to set %evl > MVL, this
*never happens*: %evl will *always* be actually set to <= MVL.

it's quite clever.

it is really really important - a critical part of the design of RVV
loops - that the programmer (or LLVM compiler developer in this case)
*not* even know or make any assumptions about what MVL will be. some
hardware will actually have MVL equal to 1. some really unbelievably
powerful and stupidly expensive hardware might have MVL equal to 65536
(yes really, 65536 wide vector ALUs) and the critical thing is, the
assembly code *does not care*. it still works perfectly on both,
despite the fact that you have no idea, really, what value MVL is
going to be.

SimpleV is different in that you absolutely must explicitly declare,
as part of any assembly loops (or any other instructions), precisely
and exactly how large MVL is to be. this is because it is an
"allocation of the number of scalar registers - from the *scalar*
regfile - to be used for the vector operation".

thus, for SimpleV, we do actually need a way in LLVM to represent
(set) MVL, because it is quite literally an "explicit reservation of a
certain size and number of registers".

think of it as a way to say "hey y'know these upcoming SIMD
instructions? yeah, we need to set them to all be of length 8 for this
set. then, like, next we need to set all the upcoming SIMD
instructions to 16, y'ken". actually they're not SIMD they're
vector-ops but you get the idea.

this we do with an *extra* parameter to the SV.SETVL instruction
https://libre-riscv.org/simple_v_extension/appendix/#index8h1

SV.SETVL a2, t4, 8 # MVL==8

now, *if* we have a way to set MVL (through LLVM-IR), we can *also*
use that for doing saving/restoring of entire scalar register files
with a single instruction, as well as use it for function call
register stack save/restore.

basically when we have control over MVL through LLVM-IR, we get a
"LD.MULTI" and "ST.MULTI" instruction "for free" as an accidental
side-benefit.

SV.SETMVL #32 ; tells the hardware that vector operations are to use 32 *scalar* regs
SV.LD a0, f0, #8 ; loads registers f0 thru f31 from the address at (a0+8)

for SIMD systems such as x86 and ARM, the only way to keep loops as
simple as RVV and SV, you'd need an instruction which, when you got to
the last run through the loop, then whilst %evl would be set to some
fixed-width-at-the-SIMD-boundary, some predicate mask was set up
*instead*... and thus despite the SIMD operation still being 4 (or 8,
or 16), the elements at the end were left alone (masked out)

without such an instruction (one which sets up the predicate bitmask
as not being all 1s on the last loop) you'd have to have a sequence of
instructions that effectively do the same job, and those instructions
will, clearly, impact performance due to them being executed on each
and every loop.

this is, unless the above is expressly supported in a single
instruction (one equivalent to SETVL
which sets up the predicate mask on the last loop) i am sorry to have
to use this particular phrase, a dog's dinner approach when compared
to variable-run vectorisation, and it's why i keep warning that
attempting to add support for fixed-power-of-two-%evl in this proposal
is not a good idea.

even if you _do_ have such an instruction (or a really really short
sequence that's equivalent and does not impact the length of the loop
too badly), the fact that the assembly code has to use 16 wide SIMD if
you want to do high-performance but then if you have short loops you
are wasting ALU resources but if you use 4 wide SIMD to stop wasting
ALU resources you can't do high-performance, you are screwed both
coming and going, and, ultimately, have to resort to stripmining to
properly solve it, and at that point we're *definitely* outside of the
scope of this proposal [as i understand it].

l.

In D57504#1857309, @programmerjake wrote:

From what I recall, the plan is to implement this by using fixed-size vector types combined with VL-based ops. MVL would be the size of those vector types.

To be clear, I'm referring specifically to LLVM IR for SimpleV, not for other targets.

OK. I was picturing MVL as some sort of maximum supported by the hardware in some sense or context. I think(?) I've got it now.

So let me ask about how you're picturing this working on targets that don't support these non-fixed vector lengths. The comments from lkcl have me concerned that we're going to be asked to emulate this behavior, which is possible I suppose but probably not the best choice performance wise. Consider this call:

%sum = call <8 x double> @llvm.vp.fadd.f64(<8 x double> %x,<8 x double> %y, <8 x i1> %mask, i32 4)

Frankly, I'd hope never to see such a thing. We talked about using -1 for the %evl argument for targets that don't support variable vector length (is that the right phrase?), but what are we supposed to do if something else is used?

Disregarding the %evl argument for the moment, the x86 type legalizer might lower this as a masked <8 x double> fadd, or it might lower it as two <4 x double> fadd operations, or it might scalarize it entirely. Even if the target hardware supports 512-bit vectors we might choose to lower it as two <4 x double> fadds. Or we might not. The backend currently considers itself to have the freedom to do anything that meets the semantics of the intrinsic. So that brings up the question of whether we will be expected to honor the %evl argument. In this case, it would be fairly trivial to do so. However, the possibility raises a concern about what the code that generated this IR was trying to do and whether it is a reasonable thing to have done for x86 backends.
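For illustration only, here is one way %evl could be carried through a split into two half-width operations. This is a hypothetical Python sketch, not a description of what the x86 type legalizer does; `split_evl` is an invented name, and it assumes 0 <= %evl <= W or a negative "ignore" value.

```python
from typing import Tuple


def split_evl(evl: int, w: int) -> Tuple[int, int]:
    """Split an %evl for a W-lane op into %evl values for the low and high halves."""
    half = w // 2
    if evl < 0:
        return -1, -1  # %evl not in effect for either half
    if evl > w:
        raise ValueError("evl > W: behavior not specified")
    lo = min(evl, half)      # lanes 0 .. half-1
    hi = max(evl - half, 0)  # lanes half .. W-1
    return lo, hi
```

For the call above (W = 8, %evl = 4) this gives (4, 0): the low half runs on all four lanes and the high half has no active lanes.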

Basically, I want to actively discourage front ends and optimizations from using the %evl argument in cases where it won't be optimal.

In D57504#1857458, @andrew.w.kaylor wrote:
OK. I was picturing MVL as some sort of maximum supported by the hardware in some sense or context. I think(?) I've got it now.

So let me ask about how you're picturing this working on targets that don't support these non-fixed vector lengths. The comments from lkcl have me concerned that we're going to be asked to emulate this behavior, which is possible I suppose but probably not the best choice performance wise. Consider this call:
%sum = call <8 x double> @llvm.vp.fadd.f64(<8 x double> %x,<8 x double> %y, <8 x i1> %mask, i32 4)
Frankly, I'd hope never to see such a thing. We talked about using -1 for the %evl argument for targets that don't support variable vector length (is that the right phrase?), but what are we supposed to do if something else is used?

For targets that do not support %evl they can say so through TTI and the ExpandVectorPredicationPass will convert it into:

%mask.vl = icmp ult <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, <i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4> ; lane index < splat(%evl)
%mask.new = and <8 x i1> %mask, %mask.vl
%sum = call <8 x double> @llvm.vp.fadd.f64(<8 x double> %x,<8 x double> %y, <8 x i1> %mask.new, i32 -1)

Basically, %evl never hits the X86 backend and can be ignored. The expansion pass implements one, unified, legalization strategy for all non-VL targets, achieving predictable behavior across targets.
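The expansion above can be modeled in a few lines of Python. This is a sketch of the idea only; `expand_evl` is a hypothetical name and the code says nothing about how the actual pass is implemented.

```python
from typing import List, Tuple


def expand_evl(mask: List[bool], evl: int) -> Tuple[List[bool], int]:
    """Fold %evl into the mask and return (new mask, -1)."""
    if evl < 0:
        return list(mask), -1  # %evl already not in effect
    # %mask.vl: lane index < splat(%evl)
    mask_vl = [i < evl for i in range(len(mask))]
    # %mask.new = and %mask, %mask.vl
    mask_new = [m and v for m, v in zip(mask, mask_vl)]
    return mask_new, -1
```

After the rewrite the set of active lanes is unchanged, but it is expressed by the mask alone.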

Disregarding the %evl argument for the moment, the x86 type legalizer might lower this as a masked <8 x double> fadd, or it might lower it as two <4 x double> fadd operations, or it might scalarize it entirely. Even if the target hardware supports 512-bit vectors we might choose to lower it as two <4 x double> fadds. Or we might not. The backend currently considers itself to have the freedom to do anything that meets the semantics of the intrinsic. So that brings up the question of whether we will be expected to honor the %evl argument. In this case, it would be fairly trivial to do so. However, the possibility raises a concern about what the code that generated this IR was trying to do and whether it is a reasonable thing to have done for x86 backends.

I see two sources for VP intrinsics in code:
1.) Hand-written intrinsic code (if we expose VP as C intrinsics in Clang and/or somebody directly implements say a math library in VP, ..)
We do not claim performance portability for VP code. If your actual target is AVX512 and you use VP intrinsics, do not use the %evl parameter (or know how the expansion pass is going to lower it and exploit that).

2.) Optimization passes and (vectorizing) frontends
Vectorizers/frontends should query TTI to decide whether they should be using %evl.
For VL targets, the loop vectorizer could use %evl to implement tail loop predication (as in the DAXPY example https://www.sigarch.org/simd-instructions-considered-harmful/ , linked by @lkcl).
For non-VL targets, you should make the iteration mask the root mask of all other predicates in the loop and set %evl to -1.
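A small Python sketch of the two strategies, assuming a single boolean stands in for whatever TTI would report. The names `tail_predicates` and `active_lanes` are hypothetical. It yields one (mask, %evl) pair per vector iteration of a loop over n elements with vector width w.

```python
from typing import Iterator, List, Tuple


def tail_predicates(n: int, w: int, target_supports_evl: bool) -> Iterator[Tuple[List[bool], int]]:
    """Yield (mask, evl) for each vector iteration of a loop over n elements."""
    for base in range(0, n, w):
        remaining = n - base
        if target_supports_evl:
            # VL target: all-true mask, tail handled by %evl.
            yield [True] * w, min(remaining, w)
        else:
            # Non-VL target: iteration mask is the root mask, %evl is -1.
            yield [i < remaining for i in range(w)], -1


def active_lanes(mask: List[bool], evl: int) -> List[bool]:
    """Lanes that actually execute for a given (mask, evl) pair."""
    return [m and (evl < 0 or i < evl) for i, m in enumerate(mask)]
```

Both strategies enable exactly the same lanes in every iteration; they differ only in whether the tail is expressed through %evl or through the mask.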

Basically, I want to actively discourage front ends and optimizations from using the %evl argument in cases where it won't be optimal.

TTI would tell front ends and optimizations that %evl is a no-go for your target. Is this enough discouragement?

In D57504#1861256, @simoll wrote:

Basically, I want to actively discourage front ends and optimizations from using the %evl argument in cases where it won't be optimal.

TTI would tell front ends and optimizations that %evl is a no-go for your target. Is this enough discouragement?

In theory, yes. In practice, it will depend on how optimizations make use of that information. Your explanation of how the ExpandVectorPredicationPass will make this palatable to the backend worries me a little, because it essentially means that optimizations don't have to care that the target doesn't support this feature. They can generate IR that uses it and EVPP will smooth over it. Obviously, we could handle this on a case-by-case basis as it comes up. As you say, TTI will provide sufficient information for passes to make the decision.

2.) Optimization passes and (vectorizing) frontends
Vectorizers/frontends should query TTI to decide whether they should be using %evl.
For VL targets, the loop vectorizer could use %evl to implement tail loop predication (as in the DAXPY example https://www.sigarch.org/simd-instructions-considered-harmful/ , linked by @lkcl).
For non-VL targets, you should make the iteration mask the root mask of all other predicates in the loop and set %evl to -1.

FWIW this is the approach we plan to use at BSC to vectorize using RISC-V extension. We're currently adding mask information to VPlan recipes that when executed should emit VPred operations with masking. Our plan includes a vplan→vplan transformation that would express the "root" mask as a "set vector length" operation.

In D57504#1862202, @andrew.w.kaylor wrote:

TTI would tell front ends and optimizations that %evl is a no-go for your target. Is this enough discouragement?

In theory, yes. In practice, it will depend on how optimizations make use of that information. Your explanation of how the ExpandVectorPredicationPass will make this palatable to the backend worries me a little, because it essentially means that optimizations don't have to care that the target doesn't support this feature. They can generate IR that uses it and EVPP will smooth over it. Obviously, we could handle this on a case-by-case basis as it comes up. As you say, TTI will provide sufficient information for passes to make the decision.

ok so it is starting to sink in what is being proposed: a *mainstream* pass in llvm that *always* puts in vector predication, and then various backends, depending on hardware capability, will either have passes that turn that mandatory vector predication into scalar loops, or SIMD / SIMT (getting rid of %evl in the process), or, in the case of Cray-inspired hardware, calling SETVL assembly code.

if that's accurate, then wow that's quite bold and has a lot of advantages.

i have a suggestion. for SimpleV we definitely need to have an explicit way to specify MVL. this is because it is literally specifying precisely how many scalar registers are to be allocated for a vector op.

however for SIMD (ARM, x86, other) i have a suspicion that being able to "hint" the best size of SIMD instruction width to use is probably a good idea.

if a SIMD width hint is available it happens to be synonymous with SimpleV's (hard) requirement to be able to specify MVL.

a scalar system would ignore both %evl and %mvl (or better mpvl - max partition vector length) i.e. passes would eliminate them.

a SIMD system would use %mpvl to choose the best SIMD opcodes for the job, the passes would subdivide work into such chunks then generate the suitable cornercase last loop as well, *ignoring* %evl in the process.

SimpleV would use both to generate opcodes, coordinating with the regfile allocator, correctly and efficiently.

simoll mentioned this in rGc49b9e0d3284: [Doc] Proposal for vector predication.Feb 10 2020, 1:36 AM

In D57504#1862968, @lkcl wrote:

i have a suggestion. for SimpleV we definitely need to have an explicit way to specify MVL. this is because it is literally specifying precisely how many scalar registers are to be allocated for a vector op.

Would it work for you if we leave the definition of MVL for scalable types to the targets?

This would allow you (and ARM MVE/SVE , RISC-V V) to have their own mechanism for setting/querying MVL.
Besides, i think that defining MVL is out of the scope of this RFC given the diversity of scalable vector ISAs right now.. again a point we could revisit should all scalable vector ISAs someday agree on one way to define MVL.

The up-to-date list of planned changes (also for this patch) is here: https://reviews.llvm.org/D69891#1871485

In D57504#1871521, @simoll wrote:

In D57504#1862968, @lkcl wrote:

i have a suggestion. for SimpleV we definitely need to have an explicit way to specify MVL. this is because it is literally specifying precisely how many scalar registers are to be allocated for a vector op.

Would it work for you if we leave the definition of MVL for scalable types to the targets?

mmm... honestly? probably not. however we can get away with either inline assembler (for a very limited subset of requirements) or just going "y'know what, let's just set MVL hard-coded to default to 4 or 8 for all loops", for now, as best matched to the (planned) maximum internal register read/write ports for our first chip.

This would allow you (and ARM MVE/SVE , RISC-V V) to have their own mechanism for setting/querying MVL.

and x86-for-hinting-the-SIMD-length. [for anyone who may be under the impression that RVV does not need the concept of MVL: see the sub-extension which fits the vector regfile onto the scalar (FP) regfile. if the FP regfile is to be used and useful at the same time, then there needs to be a way to explicitly define how much of the FP regfile is to be allocated *to* RVV, and that in turn means being able to define the number of "lanes" to actually be used... which is, funnily enough, exactly what *setting* MVL does. N(Lanes) == MVL. MVL == N(Lanes) ].

Besides, i think that defining MVL is out of the scope of this RFC given the diversity of scalable vector ISAs right now..

this is cool and exciting.

again a point we could revisit should all scalable vector ISAs someday agree on one way to define MVL.

yes, as a separate proposal.

In D57504#1871991, @lkcl wrote:

In D57504#1871521, @simoll wrote:

In D57504#1862968, @lkcl wrote:

i have a suggestion. for SimpleV we definitely need to have an explicit way to specify MVL. this is because it is literally specifying precisely how many scalar registers are to be allocated for a vector op.

Would it work for you if we leave the definition of MVL for scalable types to the targets?

mmm... honestly? probably not. however we can get away with either inline assembler (for a very limited subset of requirements) or just going "y'know what, let's just set MVL hard-coded to default to 4 or 8 for all loops", for now, as best matched to the (planned) maximum internal register read/write ports for our first chip.

I think i wasn't clear: what i meant to say is that we will not decide how MVL is defined/queried/set in the scope of this RFC... potentially leading to the situation that every target comes with its own set of target intrinsics to do so.

This would allow you (and ARM MVE/SVE , RISC-V V) to have their own mechanism for setting/querying MVL.

and x86-for-hinting-the-SIMD-length.

For x86 with scalable types, yes. For "classic" SIMD types MVL == W of <W x type>

<snip> [for anyone who may be under the impression that RVV does not need the concept of MVL: see the sub-extension which fits the vector regfile onto the scalar (FP) regfile. if the FP regfile is to be used and useful at the same time, then there needs to be a way to explicitly define how much of the FP regfile is to be allocated *to* RVV, and that in turn means being able to define the number of "lanes" to actually be used... which is, funnily enough, exactly what *setting* MVL does. N(Lanes) == MVL. MVL == N(Lanes) ].

Besides, i think that defining MVL is out of the scope of this RFC given the diversity of scalable vector ISAs right now..

this is cool and exciting.

Yep, and we wouldn't get near the level of support for this RFC otherwise.

again a point we could revisit should all scalable vector ISAs someday agree on one way to define MVL.

yes, as a separate proposal.

In D57504#1854330, @simoll wrote:

Exactly. The VE target strictly requires VL <= MVL or you'll get a hardware exception. Enforcing strict UB here means VP-users have to explicitly drop instructions that keep the VL within bounds. This means that we can optimize the VL computation code and that it can be factored into cost calculations, etc. With Options 2 & 3 this would happen only very late in the backend when most scalar optimizations are already done.

Ok, I didn't realize VE's SETVL works like that. In that case we don't have much of a choice, unfortunately.

Besides, this still allows you to speculate as long as MVL (as in the UB-causing bound for VL) does not go below VL... could you explain under which circumstance MVL would go below VL by hoisting? This is definitely not the case for static VL targets (x86) and also not for VE.

Of course, for lots of IR that we care about in practice, it will be quite simple to see that hoisting is safe, e.g. because:

%evl is a constant -1
%evl is computed in a way that can be recognized to produce a small enough value (typical strip-mined loops)
there are earlier unconditional VP operations with the same EVL value (most vectorized functions)

But you need some such analysis, and must not hoist when those tricks all fail, because there's no general guarantee that the condition you're hoisting out of is independent from "%evl > element count?". A trivial (if pathological) example of this is when the condition is never true in any execution and the EVL value is larger than W. A more real-world example, if you insist, comes from one proposed way to port hand-crafted fixed-width SIMD algorithms to RVV: check at runtime whether vector registers are at least as large as required by the SIMD algorithm, if so set the VL register to a constant and execute vector code, otherwise fall back to another implementation. This might mean having vp.foo(..., i32 4) instructions guarded by a runtime check that effectively determines whether that 4 is a legal value, and hoisting the computation out of the condition introduces UB in the executions where it isn't.

Whether this would lead to any end-to-end miscompilations is another question, but that's not a good excuse to implement known-incorrect optimizations.
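The three "easy" cases listed above, plus the refusal to hoist when none of them applies, can be sketched as a small predicate. This is only an illustration, not code from the patch: `EvlFacts`, its fields, and `safeToSpeculateVP` are hypothetical names, and the analysis that would fill in the struct is assumed to exist elsewhere. `minKnownLanes` stands for the smallest element count W that is known to hold on every execution.

```cpp
#include <cstdint>

// Hypothetical summary of what some analysis has proven about a VP call's %evl.
struct EvlFacts {
    bool isAllOnesConstant;   // %evl is the IR constant i32 -1
    bool hasUpperBound;       // an upper bound on %evl has been proven
    uint32_t upperBound;      // the proven bound (only valid if hasUpperBound)
    bool dominatedBySameEvl;  // an earlier unconditional VP op uses the same %evl
};

// Returns true only when hoisting the VP op out of its guarding condition
// cannot introduce the "%evl > W" undefined behavior.
bool safeToSpeculateVP(const EvlFacts &evl, uint32_t minKnownLanes) {
    if (evl.isAllOnesConstant)
        return true;  // EVL is not effective at all
    if (evl.hasUpperBound && evl.upperBound <= minKnownLanes)
        return true;  // e.g. a recognized strip-mined loop
    if (evl.dominatedBySameEvl)
        return true;  // UB would already have occurred earlier
    return false;     // all tricks failed: must not hoist
}
```

In the RVV porting example, %evl is the constant 4 but the guaranteed lane count is smaller than 4 without the runtime check, so the predicate returns false and the operation stays under its guard.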

In D57504#1872310, @simoll wrote:

I think i wasn't clear: what i meant to say is that we will not decide how MVL is defined/queried/set in the scope of this RFC... potentially leading to the situation that every target comes with its own set of target intrinsics to do so.

ah yes got you.

This would allow you (and ARM MVE/SVE , RISC-V V) to have their own mechanism for setting/querying MVL.

and x86-for-hinting-the-SIMD-length.

For x86 with scalable types, yes. For "classic" SIMD types MVL == W of <W x type>

mmm... i don't believe that's a wise choice / decision / assumption. i am partly-guessing-and-making-architectural-assumptions here: imagine that the (very-well-informed) programmer knows how the pipelines of a particular processor work (and i do mean very well), they know that there are a couple of separate pipelines, one which handles e.g. NxFP32, one which handles MxFP64, but that if you issue SIMD instructions of width N=Mx2, it will result in a "blockage" (stall) and under-utilisation.

*however*... if you issue *half* the workload (i.e. MVL == W/2) for the FP32 instructions interleaved with "full" workload (MVL==W for the FP64 ops), *then*, because of the way that the architecture works the two suites of instructions *will* go to the separate pipelines, *will* get done in parallel, because you're not overloading the exact same 64-bit-wide pipeline entrypoint if you'd done... you get what i'm trying to say?

i think what i'm trying to say works better for MMX (the instructions which shared the FP regfile with SIMD instructions, is that right? or is it SSE?) - there you definitely want control over how much of the regfile is allocated to SIMD and how much remains actual for scalar-FP usage, and if MVL == W as a hard-coded assumption, with no "hint", you could end up taking up far more of the FP regfile for SIMD MMX than is efficient / effective.

however... if the compiler could be *explicitly* told, "hey i want you to use only W/2 or W/4 worth of the FP regfile for SIMD operations please, and to automatically create a 2x or 4x loop that makes up for it *as if* you had done a full MVL==W single SIMD instruction", then it becomes possible to create a balance there which will not hammer the L1/L2 cache with LD/ST operations, consuming far more power than necessary, because the SIMD instructions completely dominate the entirety of the FP regfile.

we quickly learned from 3D workloads that they are very computationally-intensive and fit a "LD, massive-amounts-of-SIMD-processing, ST" pattern with *very* little in the way of overlaps. consequently, if the compiler generates:

LD
half-the-processing-because-there's-not-enough-registers
ST-some-temps
do-some-more-processing
LD-out-of-temps, do-a-bit-more-processing
ST

this is horribly, horribly power-inefficient.

so being able to balance the workload, keep things entirely in the regfile even if it means using half-wide (or quarter-wide) SIMD ops and the loops taking twice or 4 times longer in order to avoid the spill into temporary LD/STs, this is far more important than trying to make "individual" SIMD operations (ones that consume far too much of the regfile and result in LD/ST "spill") as wide as possible.

again, however: i'm raising this not to suggest that it be part of *this* RFC, i'm just documenting it to make sure it's not forgotten, for later.

Besides, i think that defining MVL is out of the scope of this RFC given the diversity of scalable vector ISAs right now..

this is cool and exciting.

Yep, and we wouldn't get near the level of support for this RFC otherwise.

yehyeh.

In D57504#1872374, @lkcl wrote:

In D57504#1872310, @simoll wrote:

I think i wasn't clear: what i meant to say is that we will not decide how MVL is defined/queried/set in the scope of this RFC... potentially leading to the situation that every target comes with its own set of target intrinsics to do so.

ah yes got you.

This would allow you (and ARM MVE/SVE , RISC-V V) to have their own mechanism for setting/querying MVL.

and x86-for-hinting-the-SIMD-length.

For x86 with scalable types, yes. For "classic" SIMD types MVL == W of <W x type>

mmm... i don't believe that's a wise choice / decision / assumption. i am partly-guessing-and-making-architectural-assumptions here: imagine that the (very-well-informed) programmer knows how the pipelines of a particular processor work (and i do mean very well), they know that there are a couple of separate pipelines, one which handles e.g. NxFP32, one which handles MxFP64, but that if you issue SIMD instructions of width N=Mx2, it will result in a "blockage" (stall) and under-utilisation.

*however*... if you issue *half* the workload (i.e. MVL == W/2) for the FP32 instructions interleaved with "full" workload (MVL==W for the FP64 ops), *then*, because of the way that the architecture works the two suites of instructions *will* go to the separate pipelines, *will* get done in parallel, because you're not overloading the exact same 64-bit-wide pipeline entrypoint if you'd done... you get what i'm trying to say?

i think what i'm trying to say works better for MMX (the instructions which shared the FP regfile with SIMD instructions, is that right? or is it SSE?) - there you definitely want control over how much of the regfile is allocated to SIMD and how much remains actual for scalar-FP usage, and if MVL == W as a hard-coded assumption, with no "hint", you could end up taking up far more of the FP regfile for SIMD MMX than is efficient / effective.

MMX does use the X87 FP register file, but they can't coexist at the same time. The first use of MMX marks the X87 register stack as occupied. I can't remember if it alters the data or not. An explicit emms instruction has to be done at the end of the MMX code to erase the MMX data and make the registers usable for X87 again.

In D57504#1854330, @simoll wrote:

But you need some such analysis, and must not hoist when those tricks all fail, because there's no general guarantee that the condition you're hoisting out of is independent from "%evl > element count?". A trivial (if pathological) example of this is when the condition is never true in any execution and the EVL value is larger than W. A more real-world example, if you insist, comes from one proposed way to port hand-crafted fixed-width SIMD algorithms to RVV: check at runtime whether vector registers are at least as large as required by the SIMD algorithm, if so set the VL register to a constant and execute vector code,

ah... ah... you can't. at least, the last version of the RVV spec that i read (7?) still explicitly states, "regardless of what *you* want VL to be set to, the *hardware* gets to decide exactly what value *actually* goes into the VL CSR".

the only guarantee that you have is that you will find that if you set VL to a non-zero value, you will find that, when you read it immediately after setting, it will be non-zero.

this specifically *does not matter* on RVV (sigh: when RVV is not done on top of the FP regfile, and there is a separate vector regfile), because the vector regfile is specifically designed to refer to *vectors*... not to individual elements.

for SimpleV, because we designed it right from the start to sit on top of the int and fp regfiles, what VL is set to *really does matter*, because it defines precisely and exactly how many of the scalar registers are to be used *as* "vector elements".

thus, for RVV, when converting SIMD assembly patterns to RVV, you absolutely *must* use the "loop pattern" described in https://www.sigarch.org/simd-instructions-considered-harmful/

if you try to hard-code-set VL to anything specific, this has the (unintended) side-effect of destroying the entire paradigm on which RVV is based, namely that you are not *supposed* to know the actual hardware vector "lane" size... at all. so, if you had really minimalist hardware which only *had* one actual "Lane", then if you tried to explicitly set VL=4, that hardware is absolutely hosed, as it is literally unable to support, at the hardware level, the three extra lanes requested/demanded.

this is why you have to "ask" for a VL, and the instruction will put the *actual* number of elements that VL got set to into a destination register, because you need to subtract that number of (processed) elements from the loop.
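The "ask for a VL, get the granted count back, subtract it from the remaining work" pattern can be sketched in plain C++ as follows. This is a software model only: `grantVL` is a hypothetical stand-in for a SETVL-style instruction and simply grants `min(requested, maxVL)`, which is the simplest possible rule and not a statement about what any particular spec version mandates. It assumes `maxVL >= 1` and that both inputs have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model of a SETVL-style request: the "hardware" grants at most
// maxVL elements per vector operation.
std::size_t grantVL(std::size_t requested, std::size_t maxVL) {
    return std::min(requested, maxVL);
}

// c[i] = a[i] + b[i], written as a strip-mined loop that never assumes a
// particular hardware lane count. Requires maxVL >= 1 and a.size() == b.size().
std::vector<int> stripMinedAdd(const std::vector<int> &a,
                               const std::vector<int> &b,
                               std::size_t maxVL) {
    std::vector<int> c(a.size());
    std::size_t done = 0;
    std::size_t remaining = a.size();
    while (remaining > 0) {
        std::size_t vl = grantVL(remaining, maxVL);  // ask, then use what was granted
        for (std::size_t i = 0; i < vl; ++i)         // one "vector op" over vl elements
            c[done + i] = a[done + i] + b[done + i];
        done += vl;        // advance past the processed elements
        remaining -= vl;   // subtract the granted count from the loop
    }
    return c;
}
```

The same function gives the same result whether `maxVL` is 1 or 64; only the number of loop iterations changes.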

of course, with the idea of dropping RVV on top of the FP regfile that goes somewhat out the window. however i'm not... welcome, shall we say... in the RV WG participation, so you'd need to take this up with them, directly. and try not to mention my name too much because they're quite likely to sabotage things (to everyone's detriment) just because i was the one that came up with the insights. *shakes head*...

In D57504#1872392, @craig.topper wrote:

MMX does use the X87 FP register file, but they can't coexist at the same time. The first use of MMX marks the X87 register stack as occupied. I can't remember if it alters the data or not. An explicit emms instruction has to be done at the end of the MMX code to erase the MMX data and make the registers usable for X87 again.

craig, thank you for correcting me. that makes a lot of sense as i can just imagine the x87 designers going "argh, how are we going to avoid a pipeline clash / mess, here" :)

you get the principle i am sure, even though MMX is not a suitable example.

In D57504#1872412, @lkcl wrote:

ah... ah... you can't. at least, the last version of the RVV spec that i read (7?) still explicitly states, "regardless of what *you* want VL to be set to, the *hardware* gets to decide exactly what value *actually* goes into the VL CSR".

the only guarantee that you have is that you will find that if you set VL to a non-zero value, you will find that, when you read it immediately after setting, it will be non-zero.

I don't know where you have gotten this idea, it has never been true for as long as I can recall. While RVV implementations have some freedom in how they set VL, there are also lots of rules governing their behavior. Most relevantly, since October 2018 (spec version 0.5-draft), programs requesting something less than or equal to the maximum VL will get exactly that number as VL, not something smaller. And even before that change, there were long-standing significant restrictions on how VL is determined beyond what you claim (see the linked commit).

Furthermore, even if what you said was true, it would not make the scheme I described invalid. VL does not change without the program deliberately executing one of a few instructions that change VL (this is already necessary for any strip-mined loop to work at all). Thus, after executing a SETVL it's enough to inspect the resulting VL to know whether it's safe to execute code that assumes a particular value of VL. More freedom in how VL is determined by the processor just means more possibilities for unnecessarily hitting the fallback path, but that only impacts performance rather than correctness.

In D57504#1876242, @rkruppe wrote:

In D57504#1872412, @lkcl wrote:

ah... ah... you can't. at least, the last version of the RVV spec that i read (7?) still explicitly states, "regardless of what *you* want VL to be set to, the *hardware* gets to decide exactly what value *actually* goes into the VL CSR".

the only guarantee that you have is that you will find that if you set VL to a non-zero value, you will find that, when you read it immediately after setting, it will be non-zero.

I don't know where you have gotten this idea, it has never been true for as long as I can recall. While RVV implementations have some freedom in how they set VL, there are also lots of rules governing their behavior. Most relevantly, since October 2018 (spec version 0.5-draft), programs requesting something less than or equal to the maximum VL will get exactly that number as VL, not something smaller. And even before that change, there were long-standing significant restrictions on how VL is determined beyond what you claim (see the linked commit).

remember, with the exclusion from discussion due to the anti-trust practices of the RISC-V Foundation, everyone on the "outside" of the RVV working group process has to "reverse-engineer" what the hell is going on. so please do be patient if i make mistakes, as i am not really very happy spending our sponsor's and donor's time (and money) extracting information from the RVV WG in this way (and shouldn't have to).

Furthermore, even if what you said was true, it would not make the scheme I described invalid.

if you are describing replacing a SIMD loop with a *single* instruction, prefixed with a "SETVL", then my understanding is that yes, it would be... *on some hardware*. if the intention is never to be fully-compatible with *all* RVV-compatible hardware, then that's fine.

think it through: imagine some hardware that has only one "lane". that hardware will ONLY have an *absolute* maximum value for MVL: one.

therefore, if you try to set VL to anything greater than 1, it will *only* permit VL to be set to 1.

the variable nature of MVL on a per-implementor basis has caused other problems as well, particularly in the element-offset (VSLIDE?) instructions. it's been a contentious issue.

VL does not change without the program deliberately executing one of a few instructions that change VL (this is already necessary for any strip-mined loop to work at all). Thus, after executing a SETVL it's enough to inspect the resulting VL to know whether it's safe to execute code that assumes a particular value of VL.

ahhh, okaay, right. i get it. so, you'd have:

SETVL a5, 4 # a5 is the dest reg where VL gets stored
if (a5 != 4)
{
    go to fallback loop
}

More freedom in how VL is determined by the processor just means more possibilities for unnecessarily hitting the fallback path, but that only impacts performance rather than correctness.

i would argue that even the check itself - having the fallback path at all - impacts performance (and increases code size).
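The scheme under discussion (request a fixed VL, inspect what was granted, and take a fallback path if it differs) can be modelled in C++ like this. It is a sketch under stated assumptions: `requestVL` is a hypothetical stand-in for SETVL that grants `min(requested, maxVL)`, and `add4Guarded`, `VLPath`, and `FixedVLResult` are illustrative names, not anything from the patch or from an RVV toolchain.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical SETVL model: grants at most maxVL elements.
std::size_t requestVL(std::size_t requested, std::size_t maxVL) {
    return std::min(requested, maxVL);
}

enum class VLPath { FixedWidth, Fallback };

struct FixedVLResult {
    std::array<int, 4> sum;
    VLPath path;  // which implementation actually ran
};

// Ported "4-wide SIMD" add: runs the fixed-width code only if VL == 4 was granted.
FixedVLResult add4Guarded(const std::array<int, 4> &a,
                          const std::array<int, 4> &b,
                          std::size_t maxVL) {
    FixedVLResult r{};
    if (requestVL(4, maxVL) == 4) {
        // Guarded region: code here may assume exactly 4 lanes.
        for (std::size_t i = 0; i < 4; ++i)
            r.sum[i] = a[i] + b[i];
        r.path = VLPath::FixedWidth;
    } else {
        // Fallback: element-at-a-time loop that assumes nothing about VL.
        for (std::size_t i = 0; i < 4; ++i)
            r.sum[i] = a[i] + b[i];
        r.path = VLPath::Fallback;
    }
    return r;
}
```

Both paths produce the same values, which matches the point made above that a more restrictive VL grant costs performance and code size (the fallback exists and may be taken) but not correctness. Hoisting the guarded body above the check is exactly the transformation that would be unsafe.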

this is why, in SimpleV, we make it mandatory that even if the underlying hardware does not have a large number of lanes, the implementation *must* provide "virtual" hardware - in effect a hardware for-loop. one other processor which does exactly this is the Broadcom VideoCore IV. it gives the *impression* of having a 16-wide FP32 SIMD capability, whereas in fact it only has a 4x FP32 operation and the hardware delays for 4 additional cycles, pushing 4 *sets* of 4x FP32 into the (one) 4-wide FP32 pipeline.
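As a rough software picture of such a "hardware for-loop", the sketch below presents a 16-wide FP32 add to the caller while internally pushing four groups of four elements through a 4-lane inner loop. The widths come from the description above; the function and type names (`virtualWideAdd`, `VirtualResult`) are made up for illustration, and the pass counter only models the extra cycles, not real timing.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLogicalWidth = 16;  // width the programmer sees
constexpr std::size_t kPhysicalLanes = 4;  // width of the one real pipeline

struct VirtualResult {
    std::array<float, kLogicalWidth> values;
    std::size_t passes;  // how many times the 4-wide pipeline was used
};

// One "16-wide" add, executed as successive 4-wide passes.
VirtualResult virtualWideAdd(const std::array<float, kLogicalWidth> &a,
                             const std::array<float, kLogicalWidth> &b) {
    VirtualResult r{};
    for (std::size_t base = 0; base < kLogicalWidth; base += kPhysicalLanes) {
        for (std::size_t lane = 0; lane < kPhysicalLanes; ++lane)
            r.values[base + lane] = a[base + lane] + b[base + lane];
        ++r.passes;  // one set of 4 elements pushed into the pipeline
    }
    return r;
}
```

With this arrangement, code requesting a VL of up to 16 never needs a fallback path, at the cost of the operation taking several passes.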

In D57504#1872493, @lkcl wrote:

In D57504#1872392, @craig.topper wrote:

MMX does use the X87 FP register file, but they can't coexist at the same time. The first use of MMX marks the X87 register stack as occupied. I can't remember if it alters the data or not. An explicit emms instruction has to be done at the end of the MMX code to erase the MMX data and make the registers usable for X87 again.

craig, thank you for correcting me. that makes a lot of sense as i can just imagine the x87 designers going "argh, how are we going to avoid a pipeline clash / mess, here" :)

you get the principle i am sure, even though MMX is not a suitable example.

I don't know about Craig, but I'm not sure I do get the principle. For any given target we have a known maximum vector width (as in total number of bits, not number of elements) that is discoverable through TargetTransformInfo. We also have a "preferred" vector width that gets a default value based on the target architecture, but can be overridden by a command line option and may change what TargetTransformInfo tells you. However, the IR is not bound by these. The optimizer and any front end can generate whatever vectors they like. If some wacky optimization wants to create a <23 x float> vector, that's legal IR. However, when it gets to the backend, the type legalizer is going to do something to break it down into chunks that can be consumed by the processor. To get nicely optimized code, there needs to be cooperation between the optimizer and the backend.
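To illustrate the "break it down into chunks" step in the simplest possible terms, here is a toy model that splits an arbitrary element count into pieces no wider than some legal lane count. It is deliberately not a description of LLVM's actual type legalizer, which may also widen or otherwise transform the type; `splitIntoLegalChunks` and `maxLegalLanes` are hypothetical names, and `maxLegalLanes` is assumed to be at least 1.

```cpp
#include <cstddef>
#include <vector>

// Toy model: cover numElements lanes with chunks of at most maxLegalLanes each.
// Requires maxLegalLanes >= 1.
std::vector<std::size_t> splitIntoLegalChunks(std::size_t numElements,
                                              std::size_t maxLegalLanes) {
    std::vector<std::size_t> chunks;
    while (numElements > 0) {
        std::size_t n = numElements < maxLegalLanes ? numElements : maxLegalLanes;
        chunks.push_back(n);  // one operation the target can consume
        numElements -= n;
    }
    return chunks;
}
```

For a `<23 x float>` value and a target with 8 float lanes, this model yields chunks of 8, 8, and 7 elements. The IR itself stays legal and target-independent; only the cost of the generated code depends on how well the chosen width matches the target.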

This is why I mentioned before that the discussion of architecture specific details in the context of defining the semantics of the IR is making me nervous. LLVM IR is designed to be target-independent. The VP semantics need to respect that.

That's not to say we can ignore target-specific details. We have two distinct lanes though -- (1) the semantics of the IR, and (2) the mechanisms by which the target details can be discovered so that pre-codegen components can tune the IR for a specific target. We need to make sure the IR semantics are rich enough to represent the details of all targets we intend to support, but the details of the target shouldn't be visible in the IR semantics. Maybe I'm preaching to the choir here. I just want to make sure we're all on the same page. Perhaps this would be cleared up if I had a better understanding of what you were saying.

In D57504#1877261, @andrew.w.kaylor wrote:

Perhaps this would be cleared up if I had a better understanding of what you were saying.

appreciated. if it's ok, can we schedule that for when it's part of a (new) proposal?

In D57504#1877268, @lkcl wrote:

In D57504#1877261, @andrew.w.kaylor wrote:

Perhaps this would be cleared up if I had a better understanding of what you were saying.

appreciated. if it's ok, can we schedule that for when it's part of a (new) proposal?

Sure.

rebased
various fixes
includes LangRef rephrasing and latest changes to VP integer patch

Harbormaster failed remote builds in B47153: Diff 246280! · Feb 24 2020, 1:06 PM

FYI, the test failures you are seeing here are due to the generalized pattern matching doing a better job at matching the fsub idiom for fneg. The required test changes are included in https://reviews.llvm.org/D75467 .

Rebased
%evl <= W or UB ensues
fixed LangRef wording

The VP integer intrinsic patch has been accepted. I'll wait until next week with the commit to leave time for comments.

FYI, the VP-integer intrinsics & langref patch is in. Next up: expansion to standard SIMD IR. I'll announce the next patch also on llvm-dev when it's on phabricator.

rkruppe mentioned this in D69891: [VP,Integer,#1] Vector-predicated integer intrinsics. · Mar 19 2020, 11:14 AM

simoll added a child revision: D78203: [VP,Integer,#2] ExpandVectorPredication pass. · Apr 15 2020, 6:54 AM

m_zuckerman added a reviewer: m_zuckerman. · Jun 2 2020, 6:18 AM

TODO: adapt to changes of get.active.lane.mask.

simoll added a child revision: D91441: [VP] Build VP SDNodes. · Nov 13 2020, 9:38 AM

simoll mentioned this in D92086: Generalized PatternMatch & InstSimplify. · Nov 25 2020, 3:01 AM

simoll added a child revision: D92086: Generalized PatternMatch & InstSimplify. · Nov 25 2020, 4:58 AM

simoll removed a child revision: D92086: Generalized PatternMatch & InstSimplify. · Nov 25 2020, 5:42 AM

Hi @simoll: a quick question regarding vp.load/vp.store/vp.gather/vp.scatter. Does the current definition of VPred allow for something similar to the !nontemporal metadata of regular load/store instructions? I don't see any explicit mention of that but maybe it is already possible using metadata or some other annotation?

Thanks!

rkruppe removed a reviewer: rkruppe. · Dec 2 2020, 9:08 AM

rkruppe removed a subscriber: rkruppe.

In D57504#2424884, @rogfer01 wrote:

Hi @simoll: a quick question regarding vp.load/vp.store/vp.gather/vp.scatter. Does the current definition of VPred allow for something similar to the !nontemporal metadata of regular load/store instructions? I don't see any explicit mention of that but maybe it is already possible using metadata or some other annotation?

This is the first time I've heard of !nontemporal metadata. I'd be absolutely in favor of supporting this in VP mem ops as well!

@hussainjk I don't think we need to support non-temporal md hints right from the start (we can tag on md later) but it'd be great to have a vp.load/store patch with just the intrinsics on Phabricator to start discussions like this and make progress on VP mem ops.

hussainjk mentioned this in D99355: Implementation of intrinsic and SDNode definitions for VP load, store, gather, scatter. · Mar 25 2021, 10:56 AM

FYI. There is a biweekly syncup call on VP (Tue, 3pm CET, next: 2021-05-11)

Minutes (with zoom link): https://docs.google.com/document/d/1q26ToudQjnqN5x31zk8zgq_s0lem1-BF8pQmciLa4k8/edit?usp=sharing

Contact me, if you want to join our Discord server.

simoll mentioned this in D93470: [VP] Binary floating-point intrinsics. · Jun 7 2021, 5:40 AM

frasercrmck mentioned this in D104308: [VP] Add vector-predicated reduction intrinsics. · Jun 15 2021, 9:41 AM

simoll mentioned this in rGd21a35ac0a95: [VP] Implementation of intrinsic and SDNode definitions for VP load, store…. · Jul 1 2021, 4:35 AM

frasercrmck mentioned this in rGf3e9047249d0: [VP] Add vector-predicated reduction intrinsics. · Aug 17 2021, 10:06 AM

Revision Contents

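Path | Size
llvm/docs/LangRef.rst | 788 lines
llvm/include/llvm/Analysis/InstructionSimplify.h | 14 lines
llvm/include/llvm/Analysis/TargetTransformInfo.h | 36 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | 13 lines
llvm/include/llvm/Bitcode/LLVMBitCodes.h | 3 lines
llvm/include/llvm/CodeGen/ExpandVectorPredication.h | 23 lines
llvm/include/llvm/CodeGen/ISDOpcodes.h | 71 lines
llvm/include/llvm/CodeGen/Passes.h | 5 lines
llvm/include/llvm/CodeGen/SelectionDAG.h | 28 lines
llvm/include/llvm/CodeGen/SelectionDAGNodes.h | 236 lines
llvm/include/llvm/IR/Attributes.td | 9 lines
llvm/include/llvm/IR/FPEnv.h | 13 lines
llvm/include/llvm/IR/IRBuilder.h | 24 lines
llvm/include/llvm/IR/IntrinsicInst.h | 166 lines
llvm/include/llvm/IR/Intrinsics.td | 490 lines
llvm/include/llvm/IR/MatcherCast.h | 65 lines
llvm/include/llvm/IR/PatternMatch.h | 313 lines
llvm/include/llvm/IR/PredicatedInst.h | 438 lines
llvm/include/llvm/IR/VPBuilder.h | 184 lines
llvm/include/llvm/IR/VPIntrinsics.def | 619 lines
llvm/include/llvm/InitializePasses.h | 1 line
llvm/include/llvm/Target/TargetSelectionDAG.td | 118 lines
llvm/lib/Analysis/InstructionSimplify.cpp | 62 lines
llvm/lib/Analysis/TargetTransformInfo.cpp | 10 lines
llvm/lib/AsmParser/LLLexer.cpp | 3 lines
llvm/lib/AsmParser/LLParser.cpp | 9 lines
llvm/lib/AsmParser/LLToken.h | 3 lines
llvm/lib/Bitcode/Reader/BitcodeReader.cpp | 18 lines
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp | 6 lines
llvm/lib/CodeGen/CMakeLists.txt | 1 line
llvm/lib/CodeGen/ExpandVectorPredication.cpp | 690 lines
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 212 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp | 57 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h | 4 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp | 372 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h | 6 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 308 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp | 5 lines
llvm/lib/CodeGen/TargetPassConfig.cpp | 5 lines
llvm/lib/IR/Attributes.cpp | 6 lines
llvm/lib/IR/CMakeLists.txt | 3 lines
llvm/lib/IR/FPEnv.cpp | 21 lines
llvm/lib/IR/IRBuilder.cpp | 54 lines
llvm/lib/IR/IntrinsicInst.cpp | 634 lines
llvm/lib/IR/PredicatedInst.cpp | 115 lines
llvm/lib/IR/VPBuilder.cpp | 182 lines
llvm/lib/IR/Verifier.cpp | 45 lines
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp | 93 lines
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp | 12 lines
llvm/lib/Transforms/InstCombine/InstCombineInternal.h | 13 lines
llvm/lib/Transforms/Utils/CodeExtractor.cpp | 3 lines
llvm/test/Bitcode/attributes.ll | 5 lines
llvm/test/CodeGen/AArch64/O0-pipeline.ll | 1 line
llvm/test/CodeGen/AArch64/O3-pipeline.ll | 1 line
llvm/test/CodeGen/ARM/O3-pipeline.ll | 1 line
llvm/test/CodeGen/Generic/expand-vp.ll | 245 lines
llvm/test/CodeGen/X86/O0-pipeline.ll | 1 line
llvm/test/CodeGen/X86/O3-pipeline.ll | 1 line
llvm/test/Transforms/InstCombine/vp-fsub.ll | 45 lines
llvm/test/Transforms/InstSimplify/vp-fsub.ll | 55 lines
llvm/test/Verifier/vp-intrinsics-constrained.ll | 17 lines
llvm/test/Verifier/vp-intrinsics.ll | 190 lines
llvm/test/Verifier/vp_attributes.ll | 13 lines
llvm/tools/llc/llc.cpp | 1 line
llvm/tools/opt/opt.cpp | 1 line
llvm/unittests/IR/CMakeLists.txt | 1 line
llvm/unittests/IR/VPIntrinsicTest.cpp | 223 lines
llvm/utils/TableGen/CodeGenIntrinsics.h | 8 lines
llvm/utils/TableGen/CodeGenTarget.cpp | 15 lines
llvm/utils/TableGen/IntrinsicEmitter.cpp | 31 lines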

Diff 246280

llvm/docs/LangRef.rst

[... 7,707 lines hidden ...]

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var		<result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var

		.. _i_sub:

'``sub``' Instruction		'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 79 lines hidden ...]
Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var		<result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
<result> = fsub float -0.0, %val ; yields float:result = -%var		<result> = fsub float -0.0, %val ; yields float:result = -%var

		.. _i_mul:

'``mul``' Instruction		'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 78 lines hidden ...]

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var		<result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var

		.. _i_udiv:

'``udiv``' Instruction		'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 30 lines hidden ...]

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = udiv i32 4, %var ; yields i32:result = 4 / %var		<result> = udiv i32 4, %var ; yields i32:result = 4 / %var

		.. _i_sdiv:

'``sdiv``' Instruction		'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 72 lines hidden ...]

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var		<result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var

		.. _i_urem:

'``urem``' Instruction		'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 28 lines hidden ...]

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = urem i32 4, %var ; yields i32:result = 4 % %var		<result> = urem i32 4, %var ; yields i32:result = 4 % %var

		.. _i_srem:

'``srem``' Instruction		'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 97 lines hidden ...]
-------------------------		-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling		Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can		in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two		commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a		operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.		single value. The resulting value is the same type as its operands.

		.. _i_shl:

'``shl``' Instruction		'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 36 lines hidden ...]
.. code-block:: text		.. code-block:: text

<result> = shl i32 4, %var ; yields i32: 4 << %var		<result> = shl i32 4, %var ; yields i32: 4 << %var
<result> = shl i32 4, 2 ; yields i32: 16		<result> = shl i32 4, 2 ; yields i32: 16
<result> = shl i32 1, 10 ; yields i32: 1024		<result> = shl i32 1, 10 ; yields i32: 1024
<result> = shl i32 1, 32 ; undefined		<result> = shl i32 1, 32 ; undefined
<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>		<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>

		.. _i_lshr:


'``lshr``' Instruction		'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 33 lines hidden ...]
.. code-block:: text		.. code-block:: text

<result> = lshr i32 4, 1 ; yields i32:result = 2		<result> = lshr i32 4, 1 ; yields i32:result = 2
<result> = lshr i32 4, 2 ; yields i32:result = 1		<result> = lshr i32 4, 2 ; yields i32:result = 1
<result> = lshr i8 4, 3 ; yields i8:result = 0		<result> = lshr i8 4, 3 ; yields i8:result = 0
<result> = lshr i8 -2, 1 ; yields i8:result = 0x7F		<result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
<result> = lshr i32 1, 32 ; undefined		<result> = lshr i32 1, 32 ; undefined
<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>		<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

		.. _i_ashr:

'``ashr``' Instruction		'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 34 lines hidden ...]
.. code-block:: text		.. code-block:: text

<result> = ashr i32 4, 1 ; yields i32:result = 2		<result> = ashr i32 4, 1 ; yields i32:result = 2
<result> = ashr i32 4, 2 ; yields i32:result = 1		<result> = ashr i32 4, 2 ; yields i32:result = 1
<result> = ashr i8 4, 3 ; yields i8:result = 0		<result> = ashr i8 4, 3 ; yields i8:result = 0
<result> = ashr i8 -2, 1 ; yields i8:result = -1		<result> = ashr i8 -2, 1 ; yields i8:result = -1
<result> = ashr i32 1, 32 ; undefined		<result> = ashr i32 1, 32 ; undefined
<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>		<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>

		.. _i_and:

'``and``' Instruction		'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 33 lines hidden ...]
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = and i32 4, %var ; yields i32:result = 4 & %var		<result> = and i32 4, %var ; yields i32:result = 4 & %var
<result> = and i32 15, 40 ; yields i32:result = 8		<result> = and i32 15, 40 ; yields i32:result = 8
<result> = and i32 4, 8 ; yields i32:result = 0		<result> = and i32 4, 8 ; yields i32:result = 0

		.. _i_or:

'``or``' Instruction		'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

[... 33 lines hidden ...]
""""""""		""""""""

::		::

<result> = or i32 4, %var ; yields i32:result = 4 \| %var		<result> = or i32 4, %var ; yields i32:result = 4 \| %var
<result> = or i32 15, 40 ; yields i32:result = 47		<result> = or i32 15, 40 ; yields i32:result = 47
<result> = or i32 4, 8 ; yields i32:result = 12		<result> = or i32 4, 8 ; yields i32:result = 12

		.. _i_xor:

'``xor``' Instruction		'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

::		::

▲ Show 20 Lines • Show All 6,787 Lines • ▼ Show 20 Lines
""""""""""		""""""""""

On some architectures the address of the code to be executed needs to be		On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This		different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``		intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine specific adjustments. The pointer		after performing the required machine specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.		returned can then be :ref:`bitcast and executed <int_trampoline>`.


		.. _int_vp:

		Vector Predication Intrinsics
		-----------------------------
		VP intrinsics are intended for predicated SIMD/vector code. A typical VP
		operation takes a vector mask and an explicit vector length parameter as in:

		::

		<W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)

		The vector mask parameter (%mask) always has a vector of ``i1`` type, for example
		``<32 x i1>``. The explicit vector length parameter (%evl) always has the type
		``i32`` and is interpreted as an unsigned integer value. It is either the IR
		constant ``i32 -1`` or in the range

		::

		0 <= %evl <= W, where W is the number of vector elements

		The VP intrinsic has undefined behavior for any other setting of %evl, for
		example if ``%evl > W``. The explicit vector length (%evl) is only effective
		when it is not the IR constant ``i32 -1``. When it is effective, it
		creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set to True, and
		all other lanes ``%evl <= i < W`` set to False. A new mask %M is calculated with an
		element-wise AND from %mask and %EVLmask:

		::

		M = %mask AND %EVLmask

		A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:

		::

		A <opcode> B = { A[i] <opcode> B[i]   if M[i] = True, and
		               { undef                otherwise
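
The mask and %evl rules above can be modeled lane by lane in scalar C++. The sketch below is only an illustration and is not code from the patch: ``kNoEVL``, ``effectiveMask`` and ``vpBinary`` are hypothetical names, ``std::nullopt`` stands in for undef, and the ``assert`` marks the case that the text declares undefined behavior.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the IR constant i32 -1 ("%evl is not effective").
constexpr std::uint32_t kNoEVL = 0xFFFFFFFFu;

template <std::size_t W>
using Mask = std::array<bool, W>;

// M = %mask AND %EVLmask, where %EVLmask[i] is true exactly for 0 <= i < evl.
template <std::size_t W>
Mask<W> effectiveMask(const Mask<W> &mask, std::uint32_t evl) {
  if (evl == kNoEVL)
    return mask; // only %mask applies
  assert(evl <= W && "the text declares %evl > W undefined behavior");
  Mask<W> m{};
  for (std::size_t i = 0; i < W; ++i)
    m[i] = mask[i] && i < evl;
  return m;
}

// Lane-wise model of a binary VP operation. A lane that is disabled by M
// stays std::nullopt, which models an undef result.
template <std::size_t W, typename T, typename Op>
std::array<std::optional<T>, W> vpBinary(const std::array<T, W> &a,
                                         const std::array<T, W> &b,
                                         const Mask<W> &mask,
                                         std::uint32_t evl, Op op) {
  const Mask<W> m = effectiveMask<W>(mask, evl);
  std::array<std::optional<T>, W> r{};
  for (std::size_t i = 0; i < W; ++i)
    if (m[i])
      r[i] = op(a[i], b[i]);
  return r;
}
```

With ``%mask = <1, 0, 1, 1>`` and ``%evl = 3``, an addition passed as ``op`` produces values on lanes 0 and 2 only; lane 1 is masked off and lane 3 lies at or beyond %evl.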

		Optimization Hint
		^^^^^^^^^^^^^^^^^

		Some targets, such as AVX512, do not support the %evl parameter in hardware.
		The use of an effective %evl is discouraged for those targets. The function
		``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
		has native support for %evl.
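
One plausible way to serve a target without native %evl support is to fold an effective %evl into the mask and then pass ``i32 -1``. The sketch below models that idea under stated assumptions: ``VPOperands`` and ``foldEVLIntoMask`` are hypothetical names, the boolean parameter stands in for the result of ``TargetTransformInfo::hasActiveVectorLength()``, and nothing here claims to mirror the expansion pass in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IR constant i32 -1.
constexpr std::uint32_t kNoEVL = 0xFFFFFFFFu;

// Hypothetical bundle of the two predication operands of a VP intrinsic.
struct VPOperands {
  std::vector<bool> mask;
  std::uint32_t evl;
};

// If the target has no native %evl support, clear every mask lane at or
// beyond %evl and make %evl ineffective. Otherwise return the operands as is.
VPOperands foldEVLIntoMask(VPOperands ops, bool targetHasActiveVectorLength) {
  if (targetHasActiveVectorLength || ops.evl == kNoEVL)
    return ops;
  assert(ops.evl <= ops.mask.size() && "%evl > W is undefined behavior");
  for (std::size_t i = ops.evl; i < ops.mask.size(); ++i)
    ops.mask[i] = false;
  ops.evl = kNoEVL;
  return ops;
}
```

The folded operands enable the same lanes as the original pair, because the new mask equals ``%mask AND %EVLmask``.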


		.. _int_vp_add:

		'``llvm.vp.add.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated integer addition of two vectors of integers.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
		of the first and second vector operand on each enabled lane. The result on
		disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = add <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef

		.. _int_vp_sub:

		'``llvm.vp.sub.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated integer subtraction of two vectors of integers.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.sub``' intrinsic performs integer subtraction
		(:ref:`sub <i_sub>`) of the first and second vector operand on each enabled
		lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = sub <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



		.. _int_vp_mul:

		'``llvm.vp.mul.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated integer multiplication of two vectors of integers.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""
		The '``llvm.vp.mul``' intrinsic performs integer multiplication
		(:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
		lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = mul <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_sdiv:

		'``llvm.vp.sdiv.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated, signed division of two vectors of integers.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
		of the first and second vector operand on each enabled lane. The result on
		disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = sdiv <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_udiv:

		'``llvm.vp.udiv.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated, unsigned division of two vectors of integers.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The third operand is the vector mask and has the same number of elements as the result vector type. The fourth operand is the explicit vector length of the operation.

		Semantics:
		""""""""""

		The '``llvm.vp.udiv``' intrinsic performs unsigned division
		(:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
		lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = udiv <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
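
The difference between the signed and unsigned division intrinsics shows up when the same bit pattern is divided both ways. The following sketch is a scalar model under assumptions: ``vpDiv``, ``udiv``, ``sdiv`` and ``asSigned`` are hypothetical helpers, lanes carry raw 32-bit patterns, and ``std::nullopt`` models undef. In this model a divisor on a disabled lane is never inspected; whether the intrinsic itself guarantees that is not stated in the text above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Interpret a 32-bit pattern as a two's-complement value, widened to 64 bits.
inline std::int64_t asSigned(std::uint32_t bits) {
  return bits < 0x80000000u ? std::int64_t{bits}
                            : std::int64_t{bits} - (std::int64_t{1} << 32);
}

inline std::uint32_t udiv(std::uint32_t x, std::uint32_t y) { return x / y; }

inline std::uint32_t sdiv(std::uint32_t x, std::uint32_t y) {
  // The quotient is formed in 64 bits and truncated back to 32 bits.
  return static_cast<std::uint32_t>(asSigned(x) / asSigned(y));
}

// Lane-wise model: the division runs only on lanes enabled by mask and evl.
template <std::size_t W, typename Op>
std::array<std::optional<std::uint32_t>, W>
vpDiv(const std::array<std::uint32_t, W> &a,
      const std::array<std::uint32_t, W> &b,
      const std::array<bool, W> &mask, std::uint32_t evl, Op op) {
  assert(evl <= W && "%evl > W is undefined behavior");
  std::array<std::optional<std::uint32_t>, W> r{};
  for (std::size_t i = 0; i < W; ++i)
    if (mask[i] && i < evl)
      r[i] = op(a[i], b[i]);
  return r;
}
```

Dividing the pattern ``0xFFFFFFF8`` by 2 gives ``0xFFFFFFFC`` (that is, -4) with ``sdiv`` and ``0x7FFFFFFC`` with ``udiv``.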



		.. _int_vp_srem:

		'``llvm.vp.srem.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated computations of the signed remainder of two integer vectors.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
		(:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
		lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = srem <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



		.. _int_vp_urem:

		'``llvm.vp.urem.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Predicated computation of the unsigned remainder of two integer vectors.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
		(:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
		lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = urem <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_ashr:

		'``llvm.vp.ashr.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Vector-predicated arithmetic right-shift.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
		(:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
		enabled lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = ashr <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_lshr:


		'``llvm.vp.lshr.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Vector-predicated logical right-shift.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.lshr``' intrinsic computes the logical right shift
		(:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
		enabled lane. The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = lshr <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
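
The arithmetic and logical right shifts differ only in how the vacated high bits are filled. The sketch below reproduces the scalar ``ashr`` and ``lshr`` example values shown earlier in this file under a predicated wrapper. All names (``ashr``, ``lshr``, ``vpShift``) are hypothetical helpers for illustration, and the inner ``assert`` reflects that the scalar examples mark a shift by the full bit width as undefined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Logical right shift: vacated high bits are filled with zeros.
inline std::uint32_t lshr(std::uint32_t x, std::uint32_t s) { return x >> s; }

// Arithmetic right shift on a raw bit pattern: vacated high bits are filled
// with copies of the sign bit.
inline std::uint32_t ashr(std::uint32_t x, std::uint32_t s) {
  const std::uint32_t shifted = x >> s;
  const bool negative = (x & 0x80000000u) != 0;
  return (negative && s > 0) ? (shifted | ~(0xFFFFFFFFu >> s)) : shifted;
}

// Lane-wise model of a predicated shift; std::nullopt models undef.
template <std::size_t W, typename Op>
std::array<std::optional<std::uint32_t>, W>
vpShift(const std::array<std::uint32_t, W> &a,
        const std::array<std::uint32_t, W> &amount,
        const std::array<bool, W> &mask, std::uint32_t evl, Op op) {
  assert(evl <= W && "%evl > W is undefined behavior");
  std::array<std::optional<std::uint32_t>, W> r{};
  for (std::size_t i = 0; i < W; ++i)
    if (mask[i] && i < evl) {
      assert(amount[i] < 32 && "shift by the bit width or more");
      r[i] = op(a[i], amount[i]);
    }
  return r;
}
```

For the vector ``<-2, 4>``, shifting by ``<1, 3>`` with ``ashr`` yields ``<-1, 0>``, and shifting by ``<1, 2>`` with ``lshr`` yields ``<0x7FFFFFFF, 1>``.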


		.. _int_vp_shl:

		'``llvm.vp.shl.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Vector-predicated left shift.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
		the first operand by the second operand on each enabled lane. The result on
		disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = shl <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_or:

		'``llvm.vp.or.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Vector-predicated or.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
		first two operands on each enabled lane. The result on disabled lanes is
		undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = or <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_and:

		'``llvm.vp.and.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Vector-predicated and.


		Arguments:
		""""""""""

		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
		the first two operands on each enabled lane. The result on disabled lanes is
		undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = and <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


		.. _int_vp_xor:

		'``llvm.vp.xor.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
		declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
		declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

		Overview:
		"""""""""

		Vector-predicated, bitwise xor.


		Arguments:
		""""""""""

		simoll (Author): Quoting @rkruppe: "[..] My main point was just that the existing select instruction is not sufficient as the second operation, for essentially the same reason why the VP intrinsics have an EVL argument instead of just the mask. Creating a VP equivalent of select (as already sketched in the other thread) resolves that concern just as well." I agree. The prototype has defined such an `llvm.vp.select` from the get-go.
		rkruppe: Oops, missed that / forgot about it. Sorry for the noise. Is there a reason why it's not in the "integer slice" patch? It's not integer-specific, but it seems to fit even less into the other slices.
		simoll (Author): I wanted to keep the integer patch concise for one. Also, having played around with this for a while now, I think that the signature of `vp.select` should be: llvm.vp.select(<W x i1> %m, %onTrue, %onFalse, i32 %threshold, i32 vlen %evl), meaning that values from %onTrue are selected where %m is true and the lane index is below %threshold. %onFalse is selected otherwise. Lane indices greater than or equal to %evl are undef as ever. In short: there is just one "merge" operation and no more separate `vp.compose`.
		The first two operands and the result have the same vector of integer type. The
		third operand is the vector mask and has the same number of elements as the
		result vector type. The fourth operand is the explicit vector length of the
		operation.

		Semantics:
		""""""""""

		The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
		the first two operands on each enabled lane.
		The result on disabled lanes is undefined.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%t = xor <4 x i32> %a, %b
		%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



		.. _int_vp_compose:

		'``llvm.vp.compose.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x float> @llvm.vp.compose.v16f32 (<16 x float> <lanes_below_pivot>, <16 x float> <lanes_ge_pivot>, i32 <pivot>, i32 <vector_length>)

		Overview:
		"""""""""

		The compose intrinsic blends two input vectors based on a pivot value.


		Arguments:
		""""""""""

		The first operand is the vector whose elements are selected below the pivot. The second operand is the vector whose values are selected starting from the pivot position. The third operand is the pivot value. The fourth operand is the explicit vector length of the operation.


		Semantics:
		""""""""""

		The '``llvm.vp.compose``' intrinsic is designed for conditional blending of two vectors based on a pivot number. All lanes below the pivot are taken from the first operand; all elements at positions greater than or equal to the pivot are taken from the second operand. It is useful for targets that support an explicit vector length and guarantee that vector instructions preserve the contents of vector registers above the AVL of the operation. Other targets may support this intrinsic differently, for example by lowering it into a select with a bitmask that represents the pivot comparison.
		The result of this operation is equivalent to a select with an equivalent predicate mask based on the pivot operand. However, as for all VP intrinsics, all lanes at positions greater than or equal to the explicit vector length are undefined.


		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.compose.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %pivot, i32 %evl)
		;; lanes of %r at positions >= %evl are undef

		;; apart from those lanes, %r is equivalent to %also.r
		%tmp = insertelement <4 x i32> undef, i32 %pivot, i32 0
		%pivot.splat = shufflevector <4 x i32> %tmp, <4 x i32> undef, <4 x i32> zeroinitializer
		%pivot.mask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %pivot.splat
		%also.r = select <4 x i1> %pivot.mask, <4 x i32> %a, <4 x i32> %b
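
The pivot-based blend can also be written as a scalar loop. This is a sketch under assumptions, not the lowering used by any target: ``vpCompose`` and ``kNoEVL`` are hypothetical names, ``std::nullopt`` models undef, and treating ``i32 -1`` as "all lanes defined" follows the general %evl rule of the VP intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the IR constant i32 -1.
constexpr std::uint32_t kNoEVL = 0xFFFFFFFFu;

// Lanes below the pivot come from belowPivot, lanes at or above it come from
// fromPivot. Lanes at positions >= evl stay std::nullopt (undef).
template <std::size_t W, typename T>
std::array<std::optional<T>, W>
vpCompose(const std::array<T, W> &belowPivot, const std::array<T, W> &fromPivot,
          std::uint32_t pivot, std::uint32_t evl) {
  const std::size_t limit = (evl == kNoEVL) ? W : evl;
  assert(limit <= W && "%evl > W is undefined behavior");
  std::array<std::optional<T>, W> r{};
  for (std::size_t i = 0; i < limit; ++i)
    r[i] = (i < pivot) ? belowPivot[i] : fromPivot[i];
  return r;
}
```

With a pivot of 1 and an explicit vector length of 3 on four lanes, lane 0 comes from the first operand, lanes 1 and 2 come from the second, and lane 3 is undef.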



		.. _int_vp_select:

		'``llvm.vp.select.*``' Intrinsics
		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

		Syntax:
		"""""""
		This is an overloaded intrinsic.

		::

		declare <16 x i32> @llvm.vp.select.v16i32 (<16 x i1> <mask>, <16 x i32> <left_op>, <16 x i32> <right_op>, i32 <vector_length>)
		declare <256 x double> @llvm.vp.select.v256f64 (<256 x i1> <mask>, <256 x double> <left_op>, <256 x double> <right_op>, i32 <vector_length>)

		Overview:
		"""""""""

		Conditional select with an explicit vector length.


		Arguments:
		""""""""""

		The first three operands and the result are vector types of the same length. The second and third operand, and the result have the same vector type. The fourth operand is the explicit vector length.

		Semantics:
		""""""""""

		The '``llvm.vp.select``' intrinsic performs conditional select (:ref:`select <i_select>`) of the second and third vector operand on each enabled lane.
		If the explicit vector length (the fourth operand) is effective, the result is undefined on lanes at positions greater than or equal to the explicit vector length.

		Examples:
		"""""""""

		.. code-block:: llvm

		%r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %mask, <4 x i32> %onTrue, <4 x i32> %onFalse, i32 %evl)
		;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

		%also.r = select <4 x i1> %mask, <4 x i32> %onTrue, <4 x i32> %onFalse
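
Unlike the arithmetic VP intrinsics, the mask of ``llvm.vp.select`` chooses between two inputs instead of disabling lanes, so only the explicit vector length can leave a lane undefined. A minimal scalar model of that reading follows. It is an illustration only: ``vpSelect`` and ``kNoEVL`` are hypothetical names, and ``std::nullopt`` models undef.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the IR constant i32 -1.
constexpr std::uint32_t kNoEVL = 0xFFFFFFFFu;

// Below evl, each lane takes onTrue where the mask bit is set and onFalse
// otherwise. Lanes at positions >= evl stay std::nullopt (undef).
template <std::size_t W, typename T>
std::array<std::optional<T>, W>
vpSelect(const std::array<bool, W> &mask, const std::array<T, W> &onTrue,
         const std::array<T, W> &onFalse, std::uint32_t evl) {
  const std::size_t limit = (evl == kNoEVL) ? W : evl;
  assert(limit <= W && "%evl > W is undefined behavior");
  std::array<std::optional<T>, W> r{};
  for (std::size_t i = 0; i < limit; ++i)
    r[i] = mask[i] ? onTrue[i] : onFalse[i];
  return r;
}
```

A lane with a false mask bit still produces a defined value here (the ``onFalse`` element), which is the difference from the model of the binary VP operations.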



.. _int_mload_mstore:		.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics		Masked Vector Load and Store Intrinsics
---------------------------------------		---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.		LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:		.. _int_mload:
▲ Show 20 Lines • Show All 3,674 Lines • Show Last 20 Lines

llvm/include/llvm/Analysis/InstructionSimplify.h

	Show First 20 Lines • Show All 47 Lines • ▼ Show 20 Lines
	struct LoopStandardAnalysisResults;			struct LoopStandardAnalysisResults;
	class OptimizationRemarkEmitter;			class OptimizationRemarkEmitter;
	class Pass;			class Pass;
	class TargetLibraryInfo;			class TargetLibraryInfo;
	class Type;			class Type;
	class Value;			class Value;
	class MDNode;			class MDNode;
	class BinaryOperator;			class BinaryOperator;
				class VPIntrinsic;
				namespace PatternMatch {
				struct PredicatedContext;
				Lint: Pre-merge checks: clang-format: please reformat the code - struct PredicatedContext; +struct PredicatedContext;
				}

	/// InstrInfoQuery provides an interface to query additional information for			/// InstrInfoQuery provides an interface to query additional information for
	/// instructions like metadata or keywords like nsw, which provides conservative			/// instructions like metadata or keywords like nsw, which provides conservative
	/// results if the users specified it is safe to use.			/// results if the users specified it is safe to use.
	struct InstrInfoQuery {			struct InstrInfoQuery {
	InstrInfoQuery(bool UMD) : UseInstrInfo(UMD) {}			InstrInfoQuery(bool UMD) : UseInstrInfo(UMD) {}
	InstrInfoQuery() : UseInstrInfo(true) {}			InstrInfoQuery() : UseInstrInfo(true) {}
	bool UseInstrInfo = true;			bool UseInstrInfo = true;
	▲ Show 20 Lines • Show All 69 Lines • ▼ Show 20 Lines
	/// Given operands for an FAdd, fold the result or return null.			/// Given operands for an FAdd, fold the result or return null.
	Value *SimplifyFAddInst(Value *LHS, Value *RHS, FastMathFlags FMF,			Value *SimplifyFAddInst(Value *LHS, Value *RHS, FastMathFlags FMF,
	const SimplifyQuery &Q);			const SimplifyQuery &Q);

	/// Given operands for an FSub, fold the result or return null.			/// Given operands for an FSub, fold the result or return null.
	Value *SimplifyFSubInst(Value *LHS, Value *RHS, FastMathFlags FMF,			Value *SimplifyFSubInst(Value *LHS, Value *RHS, FastMathFlags FMF,
	const SimplifyQuery &Q);			const SimplifyQuery &Q);

				/// Given operands for an FSub, fold the result or return null.
				Value *SimplifyFSubInst(Value *LHS, Value *RHS, FastMathFlags FMF,
				const SimplifyQuery &Q);
				Value *SimplifyPredicatedFSubInst(Value *LHS, Value *RHS,
				FastMathFlags FMF, const SimplifyQuery &Q,
				PatternMatch::PredicatedContext & PC);
				Lint: Pre-merge checks: clang-format: please reformat the code -Value *SimplifyPredicatedFSubInst(Value *LHS, Value *RHS, - FastMathFlags FMF, const SimplifyQuery &Q, - PatternMatch::PredicatedContext & PC); +Value *SimplifyPredicatedFSubInst(Value *LHS, Value *RHS, FastMathFlags FMF, + const SimplifyQuery &Q, + PatternMatch::PredicatedContext &PC);

	/// Given operands for an FMul, fold the result or return null.			/// Given operands for an FMul, fold the result or return null.
	Value *SimplifyFMulInst(Value *LHS, Value *RHS, FastMathFlags FMF,			Value *SimplifyFMulInst(Value *LHS, Value *RHS, FastMathFlags FMF,
	const SimplifyQuery &Q);			const SimplifyQuery &Q);

	/// Given operands for the multiplication of a FMA, fold the result or return			/// Given operands for the multiplication of a FMA, fold the result or return
	/// null. In contrast to SimplifyFMulInst, this function will not perform			/// null. In contrast to SimplifyFMulInst, this function will not perform
	/// simplifications whose unrounded results differ when rounded to the argument			/// simplifications whose unrounded results differ when rounded to the argument
	/// type.			/// type.
	▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines

	/// Given a callsite, fold the result or return null.			/// Given a callsite, fold the result or return null.
	Value *SimplifyCall(CallBase *Call, const SimplifyQuery &Q);			Value *SimplifyCall(CallBase *Call, const SimplifyQuery &Q);

	/// Given an operand for a Freeze, see if we can fold the result.			/// Given an operand for a Freeze, see if we can fold the result.
	/// If not, this returns null.			/// If not, this returns null.
	Value *SimplifyFreezeInst(Value *Op, const SimplifyQuery &Q);			Value *SimplifyFreezeInst(Value *Op, const SimplifyQuery &Q);

				/// Given a VP intrinsic function, fold the result or return null.
				Value *SimplifyVPIntrinsic(VPIntrinsic & VPInst, const SimplifyQuery &Q);

	/// See if we can compute a simplified version of this instruction. If not,			/// See if we can compute a simplified version of this instruction. If not,
	/// return null.			/// return null.
	Value *SimplifyInstruction(Instruction *I, const SimplifyQuery &Q,			Value *SimplifyInstruction(Instruction *I, const SimplifyQuery &Q,
	OptimizationRemarkEmitter *ORE = nullptr);			OptimizationRemarkEmitter *ORE = nullptr);

	/// Replace all uses of 'I' with 'SimpleV' and simplify the uses recursively.			/// Replace all uses of 'I' with 'SimpleV' and simplify the uses recursively.
	///			///
	/// This first performs a normal RAUW of I with SimpleV. It then recursively			/// This first performs a normal RAUW of I with SimpleV. It then recursively
	[35 lines not shown]
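
The doc comments in this header all share one contract: "fold the result or return null". The following is a minimal sketch of that contract only, using a toy value type rather than LLVM's classes. `ToyValue` and `simplifyToyFSub` are hypothetical names, and the sketch does not model the `PredicatedContext` argument of the new predicated variant. It assumes the default rounding mode.

```cpp
#include <cassert>
#include <cmath>

// Toy stand-in for an IR value: either a floating-point constant or an
// opaque operand. Hypothetical; not an LLVM type.
struct ToyValue {
  bool IsConstant;
  double Constant; // meaningful only when IsConstant is true
};

// Returns an already existing value that "LHS - RHS" simplifies to, or
// nullptr when no simplification is known. No new value is ever created.
const ToyValue *simplifyToyFSub(const ToyValue *LHS, const ToyValue *RHS) {
  assert(LHS && RHS && "operands must be non-null");
  // X - (+0.0) is X for every X. This does not hold for -0.0, because
  // (+0.0) - (-0.0) is +0.0 but so is (-0.0) - (-0.0).
  if (RHS->IsConstant && RHS->Constant == 0.0 && !std::signbit(RHS->Constant))
    return LHS;
  return nullptr; // caller keeps the original instruction
}
```

A caller would typically replace the instruction with the returned value when it is non-null and leave the instruction untouched otherwise.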

llvm/include/llvm/Analysis/TargetTransformInfo.h

[39 lines not shown]
}		}

class AssumptionCache;		class AssumptionCache;
class BlockFrequencyInfo;		class BlockFrequencyInfo;
class BranchInst;		class BranchInst;
class Function;		class Function;
class GlobalValue;		class GlobalValue;
class IntrinsicInst;		class IntrinsicInst;
		class PredicatedInstruction;
class LoadInst;		class LoadInst;
class LoopAccessInfo;		class LoopAccessInfo;
class Loop;		class Loop;
class ProfileSummaryInfo;		class ProfileSummaryInfo;
class SCEV;		class SCEV;
class ScalarEvolution;		class ScalarEvolution;
class StoreInst;		class StoreInst;
class SwitchInst;		class SwitchInst;
[1,084 lines not shown]	struct ReductionFlags {
bool NoNaN; ///< If op is an fp min/max, whether NaNs may be present.		bool NoNaN; ///< If op is an fp min/max, whether NaNs may be present.
};		};

/// \returns True if the target wants to handle the given reduction idiom in		/// \returns True if the target wants to handle the given reduction idiom in
/// the intrinsics form instead of the shuffle form.		/// the intrinsics form instead of the shuffle form.
bool useReductionIntrinsic(unsigned Opcode, Type *Ty,		bool useReductionIntrinsic(unsigned Opcode, Type *Ty,
ReductionFlags Flags) const;		ReductionFlags Flags) const;

		/// \returns True if the vector length parameter should be folded into the
		/// vector mask.
		bool
		shouldFoldVectorLengthIntoMask(const PredicatedInstruction &PredInst) const;

		/// \returns False if this VP op should be replaced by a non-VP op or an
		/// unpredicated op plus a select.
		bool supportsVPOperation(const PredicatedInstruction &PredInst) const;

/// \returns True if the target wants to expand the given reduction intrinsic		/// \returns True if the target wants to expand the given reduction intrinsic
/// into a shuffle sequence.		/// into a shuffle sequence.
bool shouldExpandReduction(const IntrinsicInst *II) const;		bool shouldExpandReduction(const IntrinsicInst *II) const;

/// \returns the size cost of rematerializing a GlobalValue address relative		/// \returns the size cost of rematerializing a GlobalValue address relative
/// to a stack reload.		/// to a stack reload.
unsigned getGISelRematGlobalCost() const;		unsigned getGISelRematGlobalCost() const;

		/// \name Vector Predication Information
		/// @{
		/// Whether the target supports the %evl parameter of VP intrinsic efficiently in hardware.
		/// (see LLVM Language Reference - "Vector Predication Intrinsics")
		/// Use of %evl is discouraged when that is not the case.
		bool hasActiveVectorLength() const;

		/// @}

/// @}		/// @}

private:		private:
/// Estimate the latency of specified instruction.		/// Estimate the latency of specified instruction.
/// Returns 1 as the default value.		/// Returns 1 as the default value.
int getInstructionLatency(const Instruction *I) const;		int getInstructionLatency(const Instruction *I) const;

/// Returns the expected throughput cost of the instruction.		/// Returns the expected throughput cost of the instruction.
[227 lines not shown]	virtual bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes,
unsigned Alignment,		unsigned Alignment,
unsigned AddrSpace) const = 0;		unsigned AddrSpace) const = 0;
virtual unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,		virtual unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const = 0;		VectorType *VecTy) const = 0;
virtual unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,		virtual unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const = 0;		VectorType *VecTy) const = 0;
		virtual bool shouldFoldVectorLengthIntoMask(
		const PredicatedInstruction &PredInst) const = 0;
		virtual bool
		supportsVPOperation(const PredicatedInstruction &PredInst) const = 0;
virtual bool useReductionIntrinsic(unsigned Opcode, Type *Ty,		virtual bool useReductionIntrinsic(unsigned Opcode, Type *Ty,
ReductionFlags) const = 0;		ReductionFlags) const = 0;
virtual bool shouldExpandReduction(const IntrinsicInst *II) const = 0;		virtual bool shouldExpandReduction(const IntrinsicInst *II) const = 0;
virtual unsigned getGISelRematGlobalCost() const = 0;		virtual unsigned getGISelRematGlobalCost() const = 0;
		virtual bool hasActiveVectorLength() const = 0;
virtual int getInstructionLatency(const Instruction *I) = 0;		virtual int getInstructionLatency(const Instruction *I) = 0;
};		};

template <typename T>		template <typename T>
class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {		class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
T Impl;		T Impl;

public:		public:
[456 lines not shown]	unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
VectorType *VecTy) const override {		VectorType *VecTy) const override {
return Impl.getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes, VecTy);		return Impl.getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes, VecTy);
}		}
unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,		unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const override {		VectorType *VecTy) const override {
return Impl.getStoreVectorFactor(VF, StoreSize, ChainSizeInBytes, VecTy);		return Impl.getStoreVectorFactor(VF, StoreSize, ChainSizeInBytes, VecTy);
}		}
		bool shouldFoldVectorLengthIntoMask(
		const PredicatedInstruction &PredInst) const override {
		return Impl.shouldFoldVectorLengthIntoMask(PredInst);
		}
		bool
		supportsVPOperation(const PredicatedInstruction &PredInst) const override {
		return Impl.supportsVPOperation(PredInst);
		}
bool useReductionIntrinsic(unsigned Opcode, Type *Ty,		bool useReductionIntrinsic(unsigned Opcode, Type *Ty,
ReductionFlags Flags) const override {		ReductionFlags Flags) const override {
return Impl.useReductionIntrinsic(Opcode, Ty, Flags);		return Impl.useReductionIntrinsic(Opcode, Ty, Flags);
}		}
bool shouldExpandReduction(const IntrinsicInst *II) const override {		bool shouldExpandReduction(const IntrinsicInst *II) const override {
return Impl.shouldExpandReduction(II);		return Impl.shouldExpandReduction(II);
}		}

unsigned getGISelRematGlobalCost() const override {		unsigned getGISelRematGlobalCost() const override {
return Impl.getGISelRematGlobalCost();		return Impl.getGISelRematGlobalCost();
}		}

		bool hasActiveVectorLength() const override {
		return Impl.hasActiveVectorLength();
		}

int getInstructionLatency(const Instruction *I) override {		int getInstructionLatency(const Instruction *I) override {
return Impl.getInstructionLatency(I);		return Impl.getInstructionLatency(I);
}		}
};		};

template <typename T>		template <typename T>
TargetTransformInfo::TargetTransformInfo(T Impl)		TargetTransformInfo::TargetTransformInfo(T Impl)
: TTIImpl(new Model<T>(Impl)) {}		: TTIImpl(new Model<T>(Impl)) {}
[97 lines not shown]
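
The new hook `shouldFoldVectorLengthIntoMask` asks whether the vector length parameter should be folded into the vector mask. As a rough illustration of what such a fold means lane by lane, here is a self-contained sketch. The helper name `foldVectorLengthIntoMask` and the use of `std::vector<bool>` are assumptions made for the example, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: combine a lane mask with an explicit vector length
// (EVL) so that afterwards only the mask has to be honoured.
std::vector<bool> foldVectorLengthIntoMask(const std::vector<bool> &Mask,
                                           std::size_t EVL) {
  // The explicit vector length may not exceed the number of lanes.
  assert(EVL <= Mask.size() && "EVL exceeds the vector width");
  std::vector<bool> Folded(Mask.size(), false);
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    Folded[Lane] = Mask[Lane] && Lane < EVL; // lanes at or past EVL are off
  return Folded;
}
```

With a four-lane mask `{1, 0, 1, 1}` and an EVL of 3, the folded mask is `{1, 0, 1, 0}`. On a target where `hasActiveVectorLength()` is false, a fold of this kind would leave the mask as the only predicate to honour.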

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

[599 lines not shown]	public:
}		}

unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,		unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const {		VectorType *VecTy) const {
return VF;		return VF;
}		}

		bool
		shouldFoldVectorLengthIntoMask(const PredicatedInstruction &PredInst) const {
		return true;
		}

		bool supportsVPOperation(const PredicatedInstruction &PredInst) const {
		return false;
		}

bool useReductionIntrinsic(unsigned Opcode, Type *Ty,		bool useReductionIntrinsic(unsigned Opcode, Type *Ty,
TTI::ReductionFlags Flags) const {		TTI::ReductionFlags Flags) const {
return false;		return false;
}		}

bool shouldExpandReduction(const IntrinsicInst *II) const {		bool shouldExpandReduction(const IntrinsicInst *II) const {
return true;		return true;
}		}

unsigned getGISelRematGlobalCost() const {		unsigned getGISelRematGlobalCost() const {
return 1;		return 1;
}		}

		bool hasActiveVectorLength() const {
		return false;
		}

protected:		protected:
// Obtain the minimum required size to hold the value (without the sign)		// Obtain the minimum required size to hold the value (without the sign)
// In case of a vector it returns the min required size for one element.		// In case of a vector it returns the min required size for one element.
unsigned minRequiredElementSize(const Value* Val, bool &isSigned) {		unsigned minRequiredElementSize(const Value* Val, bool &isSigned) {
if (isa<ConstantDataVector>(Val) || isa<ConstantVector>(Val)) {		if (isa<ConstantDataVector>(Val) || isa<ConstantVector>(Val)) {
const auto* VectorValue = cast<Constant>(Val);		const auto* VectorValue = cast<Constant>(Val);

// In case of a vector need to pick the max between the min		// In case of a vector need to pick the max between the min
[314 lines not shown]
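
Each new TTI hook in this diff appears in four places: the public `TargetTransformInfo` method, the pure virtual in `Concept`, the forwarding override in `Model<T>`, and the default in the implementation base. The sketch below reproduces that type-erasure pattern in miniature for `hasActiveVectorLength` only. `TargetInfo`, `DefaultImpl`, and `VectorLengthTargetImpl` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Default answer, analogous in spirit to the implementation-base default
// in the diff, which returns false.
struct DefaultImpl {
  bool hasActiveVectorLength() const { return false; }
};

// Hypothetical target that shadows the default with its own answer.
struct VectorLengthTargetImpl : DefaultImpl {
  bool hasActiveVectorLength() const { return true; }
};

class TargetInfo {
  // Abstract interface that every wrapped implementation must satisfy.
  struct Concept {
    virtual ~Concept() = default;
    virtual bool hasActiveVectorLength() const = 0;
  };
  // Adapter that forwards each virtual call to the concrete implementation.
  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T Impl) : Impl(std::move(Impl)) {}
    bool hasActiveVectorLength() const override {
      return Impl.hasActiveVectorLength();
    }
  };
  std::unique_ptr<Concept> Ptr;

public:
  template <typename T>
  explicit TargetInfo(T Impl) : Ptr(new Model<T>(std::move(Impl))) {}
  // Public facade: callers never see the concrete implementation type.
  bool hasActiveVectorLength() const { return Ptr->hasActiveVectorLength(); }
};
```

A target that does not define the hook inherits the default through its base, which is why the diff only needs to add the `return false` default in one place.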

llvm/include/llvm/Bitcode/LLVMBitCodes.h

[627 lines not shown]	enum AttributeKindCodes {
ATTR_KIND_OPT_FOR_FUZZING = 57,		ATTR_KIND_OPT_FOR_FUZZING = 57,
ATTR_KIND_SHADOWCALLSTACK = 58,		ATTR_KIND_SHADOWCALLSTACK = 58,
ATTR_KIND_SPECULATIVE_LOAD_HARDENING = 59,		ATTR_KIND_SPECULATIVE_LOAD_HARDENING = 59,
ATTR_KIND_IMMARG = 60,		ATTR_KIND_IMMARG = 60,
ATTR_KIND_WILLRETURN = 61,		ATTR_KIND_WILLRETURN = 61,
ATTR_KIND_NOFREE = 62,		ATTR_KIND_NOFREE = 62,
ATTR_KIND_NOSYNC = 63,		ATTR_KIND_NOSYNC = 63,
ATTR_KIND_SANITIZE_MEMTAG = 64,		ATTR_KIND_SANITIZE_MEMTAG = 64,
		ATTR_KIND_MASK = 65,
		ATTR_KIND_VECTORLENGTH = 66,
		ATTR_KIND_PASSTHRU = 67,
};		};

enum ComdatSelectionKindCodes {		enum ComdatSelectionKindCodes {
COMDAT_SELECTION_KIND_ANY = 1,		COMDAT_SELECTION_KIND_ANY = 1,
COMDAT_SELECTION_KIND_EXACT_MATCH = 2,		COMDAT_SELECTION_KIND_EXACT_MATCH = 2,
COMDAT_SELECTION_KIND_LARGEST = 3,		COMDAT_SELECTION_KIND_LARGEST = 3,
COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,		COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,
COMDAT_SELECTION_KIND_SAME_SIZE = 5,		COMDAT_SELECTION_KIND_SAME_SIZE = 5,
[14 lines not shown]

llvm/include/llvm/CodeGen/ExpandVectorPredication.h

This file was added.

				//===----- ExpandVectorPredication.h - Expand vector predication --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CODEGEN_EXPANDVECTORPREDICATION_H
				#define LLVM_CODEGEN_EXPANDVECTORPREDICATION_H

				#include "llvm/IR/PassManager.h"

				namespace llvm {

				class ExpandVectorPredicationPass
				: public PassInfoMixin<ExpandVectorPredicationPass> {
				public:
				PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
				};
				} // end namespace llvm

				#endif // LLVM_CODEGEN_EXPANDVECTORPREDICATION_H
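
This header only declares the pass, so the diff view does not show what the expansion produces. The `supportsVPOperation` comment in TargetTransformInfo.h describes the fallback as "a non-VP op or an unpredicated op plus a select". The following lane-wise sketch models the second form for an integer add. It is an illustration under assumptions: `expandPredicatedAdd` is a hypothetical name, and the `Passthru` operand for disabled lanes is a choice made for the example, not something this header specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Hypothetical lane-wise model of expanding a predicated add into an
// unpredicated add followed by a select.
Vec expandPredicatedAdd(const Vec &A, const Vec &B,
                        const std::vector<bool> &Mask, std::size_t EVL,
                        const Vec &Passthru) {
  assert(A.size() == B.size() && A.size() == Mask.size() &&
         A.size() == Passthru.size() && EVL <= A.size());
  // Step 1: the unpredicated operation, computed on every lane.
  Vec Sum(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Sum[I] = A[I] + B[I];
  // Step 2: the select keeps the sum only on lanes that are enabled by
  // both the mask and the explicit vector length.
  Vec Result(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Result[I] = (Mask[I] && I < EVL) ? Sum[I] : Passthru[I];
  return Result;
}
```

Computing every lane first is harmless for an add. For operations that can trap, such as integer division, executing disabled lanes is not obviously safe, and this sketch does not address that case.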

llvm/include/llvm/CodeGen/ISDOpcodes.h

[31 lines not shown]	namespace ISD {
/// instruction sets as much as possible, and only use target-dependent		/// instruction sets as much as possible, and only use target-dependent
/// operators when they have special requirements.		/// operators when they have special requirements.
///		///
/// Finally, during and after selection proper, SNodes may use special		/// Finally, during and after selection proper, SNodes may use special
/// operator codes that correspond directly with MachineInstr opcodes. These		/// operator codes that correspond directly with MachineInstr opcodes. These
/// are used to represent selected instructions. See the isMachineOpcode()		/// are used to represent selected instructions. See the isMachineOpcode()
/// and getMachineOpcode() member functions of SDNode.		/// and getMachineOpcode() member functions of SDNode.
///		///
enum NodeType {		enum NodeType {
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - enum NodeType { - /// DELETED_NODE - This is an illegal value that is used to catch - /// errors. This opcode is not a legal opcode for any node. - DELETED_NODE, - - /// EntryToken - This is the marker used to indicate the start of a region. - EntryToken, - - /// TokenFactor - This node takes multiple tokens as input and produces a - /// single token result. This is used to represent the fact that the operand - /// operators are independent of each other. - TokenFactor, - - /// AssertSext, AssertZext - These nodes record if a register contains a - /// value that has already been zero or sign extended from a narrower type. - /// These nodes take two operands. The first is the node that has already - /// been extended, and the second is a value type node indicating the width - /// of the extension - AssertSext, AssertZext, - - /// Various leaf nodes. - BasicBlock, VALUETYPE, CONDCODE, Register, RegisterMask, - Constant, ConstantFP, - GlobalAddress, GlobalTLSAddress, FrameIndex, - JumpTable, ConstantPool, ExternalSymbol, BlockAddress, - - /// The address of the GOT - GLOBAL_OFFSET_TABLE, - - /// FRAMEADDR, RETURNADDR - These nodes represent llvm.frameaddress and - /// llvm.returnaddress on the DAG. These nodes take one operand, the index - /// of the frame or return address to return. An index of zero corresponds - /// to the current function's frame or return address, an index of one to - /// the parent's frame or return address, and so on. - FRAMEADDR, RETURNADDR, ADDROFRETURNADDR, SPONENTRY, - - /// LOCAL_RECOVER - Represents the llvm.localrecover intrinsic. - /// Materializes the offset from the local object pointer of another - /// function to a particular local object passed to llvm.localescape. The - /// operand is the MCSymbol label used to represent this offset, since - /// typically the offset is not known until after code generation of the - /// parent. 
- LOCAL_RECOVER, - - /// READ_REGISTER, WRITE_REGISTER - This node represents llvm.register on - /// the DAG, which implements the named register global variables extension. - READ_REGISTER, - WRITE_REGISTER, - - /// FRAME_TO_ARGS_OFFSET - This node represents offset from frame pointer to - /// first (possible) on-stack argument. This is needed for correct stack - /// adjustment during unwind. - FRAME_TO_ARGS_OFFSET, - - /// EH_DWARF_CFA - This node represents the pointer to the DWARF Canonical - /// Frame Address (CFA), generally the value of the stack pointer at the - /// call site in the previous frame. - EH_DWARF_CFA, - - /// OUTCHAIN = EH_RETURN(INCHAIN, OFFSET, HANDLER) - This node represents - /// 'eh_return' gcc dwarf builtin, which is used to return from - /// exception. The general meaning is: adjust stack by OFFSET and pass - /// execution to HANDLER. Many platform-related details also :) - EH_RETURN, - - /// RESULT, OUTCHAIN = EH_SJLJ_SETJMP(INCHAIN, buffer) - /// This corresponds to the eh.sjlj.setjmp intrinsic. - /// It takes an input chain and a pointer to the jump buffer as inputs - /// and returns an outchain. - EH_SJLJ_SETJMP, - - /// OUTCHAIN = EH_SJLJ_LONGJMP(INCHAIN, buffer) - /// This corresponds to the eh.sjlj.longjmp intrinsic. - /// It takes an input chain and a pointer to the jump buffer as inputs - /// and returns an outchain. - EH_SJLJ_LONGJMP, - - /// OUTCHAIN = EH_SJLJ_SETUP_DISPATCH(INCHAIN) - /// The target initializes the dispatch table here. - EH_SJLJ_SETUP_DISPATCH, - - /// TargetConstant* - Like Constant, but the DAG does not do any folding, - /// simplification, or lowering of the constant. They are used for constants - /// which are known to fit in the immediate fields of their users, or for - /// carrying magic numbers which are not values which need to be - /// materialized in registers. 
- TargetConstant, - TargetConstantFP, - - /// TargetGlobalAddress - Like GlobalAddress, but the DAG does no folding or - /// anything else with this node, and this is valid in the target-specific - /// dag, turning into a GlobalAddress operand. - TargetGlobalAddress, - TargetGlobalTLSAddress, - TargetFrameIndex, - TargetJumpTable, - TargetConstantPool, - TargetExternalSymbol, - TargetBlockAddress, - - MCSymbol, - - /// TargetIndex - Like a constant pool entry, but with completely - /// target-dependent semantics. Holds target flags, a 32-bit index, and a - /// 64-bit index. Targets can use this however they like. - TargetIndex, - - /// RESULT = INTRINSIC_WO_CHAIN(INTRINSICID, arg1, arg2, ...) - /// This node represents a target intrinsic function with no side effects. - /// The first operand is the ID number of the intrinsic from the - /// llvm::Intrinsic namespace. The operands to the intrinsic follow. The - /// node returns the result of the intrinsic. - INTRINSIC_WO_CHAIN, - - /// RESULT,OUTCHAIN = INTRINSIC_W_CHAIN(INCHAIN, INTRINSICID, arg1, ...) - /// This node represents a target intrinsic function with side effects that - /// returns a result. The first operand is a chain pointer. The second is - /// the ID number of the intrinsic from the llvm::Intrinsic namespace. The - /// operands to the intrinsic follow. The node has two results, the result - /// of the intrinsic and an output chain. - INTRINSIC_W_CHAIN, - - /// OUTCHAIN = INTRINSIC_VOID(INCHAIN, INTRINSICID, arg1, arg2, ...) - /// This node represents a target intrinsic function with side effects that - /// does not return a result. The first operand is a chain pointer. The - /// second is the ID number of the intrinsic from the llvm::Intrinsic - /// namespace. The operands to the intrinsic follow. - INTRINSIC_VOID, - - /// CopyToReg - This node has three operands: a chain, a register number to - /// set to this value, and a value. 
- CopyToReg, - - /// CopyFromReg - This node indicates that the input value is a virtual or - /// physical register that is defined outside of the scope of this - /// SelectionDAG. The register is available from the RegisterSDNode object. - CopyFromReg, - - /// UNDEF - An undefined node. - UNDEF, - - /// EXTRACT_ELEMENT - This is used to get the lower or upper (determined by - /// a Constant, which is required to be operand #1) half of the integer or - /// float value specified as operand #0. This is only for use before - /// legalization, for values that will be broken into multiple registers. - EXTRACT_ELEMENT, - - /// BUILD_PAIR - This is the opposite of EXTRACT_ELEMENT in some ways. - /// Given two values of the same integer value type, this produces a value - /// twice as big. Like EXTRACT_ELEMENT, this can only be used before - /// legalization. The lower part of the composite value should be in - /// element 0 and the upper part should be in element 1. - BUILD_PAIR, - - /// MERGE_VALUES - This node takes multiple discrete operands and returns - /// them all as its individual results. This nodes has exactly the same - /// number of inputs and outputs. This node is useful for some pieces of the - /// code generator that want to think about a single node with multiple - /// results, not multiple nodes. - MERGE_VALUES, - - /// Simple integer binary arithmetic operators. - ADD, SUB, MUL, SDIV, UDIV, SREM, UREM, - VP_ADD, VP_SUB, VP_MUL, VP_SDIV, VP_UDIV, VP_SREM, VP_UREM, - - /// SMUL_LOHI/UMUL_LOHI - Multiply two integers of type iN, producing - /// a signed/unsigned value of type i[2N], and return the full value as - /// two results, each of type iN. - SMUL_LOHI, UMUL_LOHI, - - /// SDIVREM/UDIVREM - Divide two integers and produce both a quotient and - /// remainder result. - SDIVREM, UDIVREM, - - /// CARRY_FALSE - This node is used when folding other nodes, - /// like ADDC/SUBC, which indicate the carry result is always false. 
- CARRY_FALSE, - - /// Carry-setting nodes for multiple precision addition and subtraction. - /// These nodes take two operands of the same value type, and produce two - /// results. The first result is the normal add or sub result, the second - /// result is the carry flag result. - /// FIXME: These nodes are deprecated in favor of ADDCARRY and SUBCARRY. - /// They are kept around for now to provide a smooth transition path - /// toward the use of ADDCARRY/SUBCARRY and will eventually be removed. - ADDC, SUBC, - - /// Carry-using nodes for multiple precision addition and subtraction. These - /// nodes take three operands: The first two are the normal lhs and rhs to - /// the add or sub, and the third is the input carry flag. These nodes - /// produce two results; the normal result of the add or sub, and the output - /// carry flag. These nodes both read and write a carry flag to allow them - /// to them to be chained together for add and sub of arbitrarily large - /// values. - ADDE, SUBE, - - /// Carry-using nodes for multiple precision addition and subtraction. - /// These nodes take three operands: The first two are the normal lhs and - /// rhs to the add or sub, and the third is a boolean indicating if there - /// is an incoming carry. These nodes produce two results: the normal - /// result of the add or sub, and the output carry so they can be chained - /// together. The use of this opcode is preferable to adde/sube if the - /// target supports it, as the carry is a regular value rather than a - /// glue, which allows further optimisation. - ADDCARRY, SUBCARRY, - - /// RESULT, BOOL = [SU]ADDO(LHS, RHS) - Overflow-aware nodes for addition. - /// These nodes take two operands: the normal LHS and RHS to the add. They - /// produce two results: the normal result of the add, and a boolean that - /// indicates if an overflow occurred (not a flag, because it may be store - /// to memory, etc.). 
If the type of the boolean is not i1 then the high - /// bits conform to getBooleanContents. - /// These nodes are generated from llvm.[su]add.with.overflow intrinsics. - SADDO, UADDO, - - /// Same for subtraction. - SSUBO, USUBO, - - /// Same for multiplication. - SMULO, UMULO, - - /// RESULT = [US]ADDSAT(LHS, RHS) - Perform saturation addition on 2 - /// integers with the same bit width (W). If the true value of LHS + RHS - /// exceeds the largest value that can be represented by W bits, the - /// resulting value is this maximum value. Otherwise, if this value is less - /// than the smallest value that can be represented by W bits, the - /// resulting value is this minimum value. - SADDSAT, UADDSAT, - - /// RESULT = [US]SUBSAT(LHS, RHS) - Perform saturation subtraction on 2 - /// integers with the same bit width (W). If the true value of LHS - RHS - /// exceeds the largest value that can be represented by W bits, the - /// resulting value is this maximum value. Otherwise, if this value is less - /// than the smallest value that can be represented by W bits, the - /// resulting value is this minimum value. - SSUBSAT, USUBSAT, - - /// RESULT = [US]MULFIX(LHS, RHS, SCALE) - Perform fixed point multiplication on - /// 2 integers with the same width and scale. SCALE represents the scale of - /// both operands as fixed point numbers. This SCALE parameter must be a - /// constant integer. A scale of zero is effectively performing - /// multiplication on 2 integers. - SMULFIX, UMULFIX, - - /// Same as the corresponding unsaturated fixed point instructions, but the - /// result is clamped between the min and max values representable by the - /// bits of the first 2 operands. - SMULFIXSAT, UMULFIXSAT, - - /// RESULT = [US]DIVFIX(LHS, RHS, SCALE) - Perform fixed point division on - /// 2 integers with the same width and scale. SCALE represents the scale - /// of both operands as fixed point numbers. This SCALE parameter must be a - /// constant integer. 
- SDIVFIX, UDIVFIX, - - /// Same as the corresponding unsaturated fixed point instructions, but the - /// result is clamped between the min and max values representable by the - /// bits of the first 2 operands. - SDIVFIXSAT, UDIVFIXSAT, - - /// Simple binary floating point operators. - FADD, FSUB, FMUL, FDIV, FREM, - VP_FADD, VP_FSUB, VP_FMUL, VP_FDIV, VP_FREM, - - /// Constrained versions of the binary floating point operators. - /// These will be lowered to the simple operators before final selection. - /// They are used to limit optimizations while the DAG is being - /// optimized. - STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM, - STRICT_FMA, - - /// Constrained versions of libm-equivalent floating point intrinsics. - /// These will be lowered to the equivalent non-constrained pseudo-op - /// (or expanded to the equivalent library call) before final selection. - /// They are used to limit optimizations while the DAG is being optimized. - STRICT_FSQRT, STRICT_FPOW, STRICT_FPOWI, STRICT_FSIN, STRICT_FCOS, - STRICT_FEXP, STRICT_FEXP2, STRICT_FLOG, STRICT_FLOG10, STRICT_FLOG2, - STRICT_FRINT, STRICT_FNEARBYINT, STRICT_FMAXNUM, STRICT_FMINNUM, - STRICT_FCEIL, STRICT_FFLOOR, STRICT_FROUND, STRICT_FTRUNC, - STRICT_LROUND, STRICT_LLROUND, STRICT_LRINT, STRICT_LLRINT, - STRICT_FMAXIMUM, STRICT_FMINIMUM, - - /// STRICT_FP_TO_[US]INT - Convert a floating point value to a signed or - /// unsigned integer. These have the same semantics as fptosi and fptoui - /// in IR. - /// They are used to limit optimizations while the DAG is being optimized. - STRICT_FP_TO_SINT, - STRICT_FP_TO_UINT, - - /// STRICT_[US]INT_TO_FP - Convert a signed or unsigned integer to - /// a floating point value. These have the same semantics as sitofp and - /// uitofp in IR. - /// They are used to limit optimizations while the DAG is being optimized. 
- STRICT_SINT_TO_FP, - STRICT_UINT_TO_FP, - - /// X = STRICT_FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating - /// point type down to the precision of the destination VT. TRUNC is a - /// flag, which is always an integer that is zero or one. If TRUNC is 0, - /// this is a normal rounding, if it is 1, this FP_ROUND is known to not - /// change the value of Y. - /// - /// The TRUNC = 1 case is used in cases where we know that the value will - /// not be modified by the node, because Y is not using any of the extra - /// precision of source type. This allows certain transformations like - /// STRICT_FP_EXTEND(STRICT_FP_ROUND(X,1)) -> X which are not safe for - /// STRICT_FP_EXTEND(STRICT_FP_ROUND(X,0)) because the extra bits aren't - /// removed. - /// It is used to limit optimizations while the DAG is being optimized. - STRICT_FP_ROUND, - - /// X = STRICT_FP_EXTEND(Y) - Extend a smaller FP type into a larger FP - /// type. - /// It is used to limit optimizations while the DAG is being optimized. - STRICT_FP_EXTEND, - - /// STRICT_FSETCC/STRICT_FSETCCS - Constrained versions of SETCC, used - /// for floating-point operands only. STRICT_FSETCC performs a quiet - /// comparison operation, while STRICT_FSETCCS performs a signaling - /// comparison operation. - STRICT_FSETCC, STRICT_FSETCCS, - - /// FMA - Perform a * b + c with no intermediate rounding step. - FMA, - VP_FMA, - - /// FMAD - Perform a * b + c, while getting the same result as the - /// separately rounded operations. - FMAD, - - /// FCOPYSIGN(X, Y) - Return the value of X with the sign of Y. NOTE: This - /// DAG node does not require that X and Y have the same type, just that - /// they are both floating point. X and the result must have the same type. - /// FCOPYSIGN(f32, f64) is allowed. - FCOPYSIGN, - - /// INT = FGETSIGN(FP) - Return the sign bit of the specified floating point - /// value as an integer 0/1 value. 
- FGETSIGN, - - /// Returns platform specific canonical encoding of a floating point number. - FCANONICALIZE, - - /// BUILD_VECTOR(ELT0, ELT1, ELT2, ELT3,...) - Return a vector with the - /// specified, possibly variable, elements. The number of elements is - /// required to be a power of two. The types of the operands must all be - /// the same and must match the vector element type, except that integer - /// types are allowed to be larger than the element type, in which case - /// the operands are implicitly truncated. - BUILD_VECTOR, - - /// INSERT_VECTOR_ELT(VECTOR, VAL, IDX) - Returns VECTOR with the element - /// at IDX replaced with VAL. If the type of VAL is larger than the vector - /// element type then VAL is truncated before replacement. - INSERT_VECTOR_ELT, - - /// EXTRACT_VECTOR_ELT(VECTOR, IDX) - Returns a single element from VECTOR - /// identified by the (potentially variable) element number IDX. If the - /// return type is an integer type larger than the element type of the - /// vector, the result is extended to the width of the return type. In - /// that case, the high bits are undefined. - EXTRACT_VECTOR_ELT, - - /// CONCAT_VECTORS(VECTOR0, VECTOR1, ...) - Given a number of values of - /// vector type with the same length and element type, this produces a - /// concatenated vector result value, with length equal to the sum of the - /// lengths of the input vectors. - CONCAT_VECTORS, - - /// INSERT_SUBVECTOR(VECTOR1, VECTOR2, IDX) - Returns a vector - /// with VECTOR2 inserted into VECTOR1 at the (potentially - /// variable) element number IDX, which must be a multiple of the - /// VECTOR2 vector length. The elements of VECTOR1 starting at - /// IDX are overwritten with VECTOR2. Elements IDX through - /// vector_length(VECTOR2) must be valid VECTOR1 indices. 
- INSERT_SUBVECTOR, - - /// EXTRACT_SUBVECTOR(VECTOR, IDX) - Returns a subvector from VECTOR (an - /// vector value) starting with the element number IDX, which must be a - /// constant multiple of the result vector length. - EXTRACT_SUBVECTOR, - - /// VECTOR_SHUFFLE(VEC1, VEC2) - Returns a vector, of the same type as - /// VEC1/VEC2. A VECTOR_SHUFFLE node also contains an array of constant int - /// values that indicate which value (or undef) each result element will - /// get. These constant ints are accessible through the - /// ShuffleVectorSDNode class. This is quite similar to the Altivec - /// 'vperm' instruction, except that the indices must be constants and are - /// in terms of the element size of VEC1/VEC2, not in terms of bytes. - VECTOR_SHUFFLE, - - /// VP_VSHIFT(VEC1, AMOUNT, MASK, VLEN) - Returns a vector, of the same type as - /// VEC1. AMOUNT is an integer value. The returned vector is equivalent - /// to VEC1 shifted by AMOUNT (RETURNED_VEC[idx] = VEC1[idx + AMOUNT]). - VP_VSHIFT, - - /// VP_COMPRESS(VEC1, MASK, VLEN) - Returns a vector, of the same type as - /// VEC1. - VP_COMPRESS, - - /// VP_EXPAND(VEC1, MASK, VLEN) - Returns a vector, of the same type as - /// VEC1. - VP_EXPAND, - - /// SCALAR_TO_VECTOR(VAL) - This represents the operation of loading a - /// scalar value into element 0 of the resultant vector type. The top - /// elements 1 to N-1 of the N-element vector are undefined. The type - /// of the operand must match the vector element type, except when they - /// are integer types. In this case the operand is allowed to be wider - /// than the vector element type, and is implicitly truncated to it. - SCALAR_TO_VECTOR, - - /// SPLAT_VECTOR(VAL) - Returns a vector with the scalar value VAL - /// duplicated in all lanes. The type of the operand must match the vector - /// element type, except when they are integer types. 
In this case the - /// operand is allowed to be wider than the vector element type, and is - /// implicitly truncated to it. - SPLAT_VECTOR, - - /// MULHU/MULHS - Multiply high - Multiply two integers of type iN, - /// producing an unsigned/signed value of type i[2N], then return the top - /// part. - MULHU, MULHS, - - /// [US]{MIN/MAX} - Binary minimum or maximum of signed or unsigned - /// integers. - SMIN, SMAX, UMIN, UMAX, - - /// Bitwise operators - logical and, logical or, logical xor. - AND, OR, XOR, - VP_AND, VP_OR, VP_XOR, - - /// ABS - Determine the unsigned absolute value of a signed integer value of - /// the same bitwidth. - /// Note: A value of INT_MIN will return INT_MIN, no saturation or overflow - /// is performed. - ABS, - - /// Shift and rotation operations. After legalization, the type of the - /// shift amount is known to be TLI.getShiftAmountTy(). Before legalization - /// the shift amount can be any type, but care must be taken to ensure it is - /// large enough. TLI.getShiftAmountTy() is i8 on some targets, but before - /// legalization, types like i1024 can occur and i8 doesn't have enough bits - /// to represent the shift amount. - /// When the 1st operand is a vector, the shift amount must be in the same - /// type. (TLI.getShiftAmountTy() will return the same type when the input - /// type is a vector.) - /// For rotates and funnel shifts, the shift amount is treated as an unsigned - /// amount modulo the element size of the first operand. - /// - /// Funnel 'double' shifts take 3 operands, 2 inputs and the shift amount. - /// fshl(X,Y,Z): (X << (Z % BW)) | (Y >> (BW - (Z % BW))) - /// fshr(X,Y,Z): (X << (BW - (Z % BW))) | (Y >> (Z % BW)) - SHL, SRA, SRL, ROTL, ROTR, FSHL, FSHR, - VP_SHL, VP_SRA, VP_SRL, - - /// Byte Swap and Counting operators. - BSWAP, CTTZ, CTLZ, CTPOP, BITREVERSE, - - /// Bit counting operators with an undefined result for zero inputs. - CTTZ_ZERO_UNDEF, CTLZ_ZERO_UNDEF, - - /// Select(COND, TRUEVAL, FALSEVAL).
If the type of the boolean COND is not - /// i1 then the high bits must conform to getBooleanContents. - SELECT, - - /// Select with a vector condition (op #0) and two vector operands (ops #1 - /// and #2), returning a vector result. All vectors have the same length. - /// Much like the scalar select and setcc, each bit in the condition selects - /// whether the corresponding result element is taken from op #1 or op #2. - /// At first, the VSELECT condition is of vXi1 type. Later, targets may - /// change the condition type in order to match the VSELECT node using a - /// pattern. The condition follows the BooleanContent format of the target. - VSELECT, - VP_SELECT, - - /// Select with an integer pivot (op #0) and two vector operands (ops #1 - /// and #2), returning a vector result. Op #3 is the vector length, all - /// vectors have the same length. - /// Vector elements below the pivot (op #0) are taken from op #1, elements - /// at positions greater than or equal to the pivot are taken from op #2. - VP_COMPOSE, - - /// Select with condition operator - This selects between a true value and - /// a false value (ops #2 and #3) based on the boolean result of comparing - /// the lhs and rhs (ops #0 and #1) of a conditional expression with the - /// condition code in op #4, a CondCodeSDNode. - SELECT_CC, - - /// SetCC operator - This evaluates to a true value iff the condition is - /// true. If the result value type is not i1 then the high bits conform - /// to getBooleanContents. The operands to this are the left and right - /// operands to compare (ops #0, and #1) and the condition code to compare - /// them with (op #2) as a CondCodeSDNode. If the operands are vector types - /// then the result type must also be a vector type. - SETCC, - VP_SETCC, - - /// Like SetCC, ops #0 and #1 are the LHS and RHS operands to compare, but - /// op #2 is a boolean indicating if there is an incoming carry.
This - /// operator checks the result of "LHS - RHS - Carry", and can be used to - /// compare two wide integers: - /// (setcccarry lhshi rhshi (subcarry lhslo rhslo) cc). - /// Only valid for integers. - SETCCCARRY, - - /// SHL_PARTS/SRA_PARTS/SRL_PARTS - These operators are used for expanded - /// integer shift operations. The operation ordering is: - /// [Lo,Hi] = op [LoLHS,HiLHS], Amt - SHL_PARTS, SRA_PARTS, SRL_PARTS, - - /// Conversion operators. These are all single input single output - /// operations. For all of these, the result type must be strictly - /// wider or narrower (depending on the operation) than the source - /// type. - - /// SIGN_EXTEND - Used for integer types, replicating the sign bit - /// into new bits. - SIGN_EXTEND, - - /// ZERO_EXTEND - Used for integer types, zeroing the new bits. - ZERO_EXTEND, - - /// ANY_EXTEND - Used for integer types. The high bits are undefined. - ANY_EXTEND, - - /// TRUNCATE - Completely drop the high bits. - TRUNCATE, - - /// [SU]INT_TO_FP - These operators convert integers (whose interpreted sign - /// depends on the first letter) to floating point. - SINT_TO_FP, - UINT_TO_FP, - VP_SINT_TO_FP, - VP_UINT_TO_FP, - - /// SIGN_EXTEND_INREG - This operator atomically performs a SHL/SRA pair to - /// sign extend a small value in a large integer register (e.g. sign - /// extending the low 8 bits of a 32-bit register to fill the top 24 bits - /// with the 7th bit). The size of the smaller type is indicated by the 1th - /// operand, a ValueType node. - SIGN_EXTEND_INREG, - - /// ANY_EXTEND_VECTOR_INREG(Vector) - This operator represents an - /// in-register any-extension of the low lanes of an integer vector. The - /// result type must have fewer elements than the operand type, and those - /// elements must be larger integer types such that the total size of the - /// operand type is less than or equal to the size of the result type. 
Each - /// of the low operand elements is any-extended into the corresponding, - /// wider result elements with the high bits becoming undef. - /// NOTE: The type legalizer prefers to make the operand and result size - /// the same to allow expansion to shuffle vector during op legalization. - ANY_EXTEND_VECTOR_INREG, - - /// SIGN_EXTEND_VECTOR_INREG(Vector) - This operator represents an - /// in-register sign-extension of the low lanes of an integer vector. The - /// result type must have fewer elements than the operand type, and those - /// elements must be larger integer types such that the total size of the - /// operand type is less than or equal to the size of the result type. Each - /// of the low operand elements is sign-extended into the corresponding, - /// wider result elements. - /// NOTE: The type legalizer prefers to make the operand and result size - /// the same to allow expansion to shuffle vector during op legalization. - SIGN_EXTEND_VECTOR_INREG, - - /// ZERO_EXTEND_VECTOR_INREG(Vector) - This operator represents an - /// in-register zero-extension of the low lanes of an integer vector. The - /// result type must have fewer elements than the operand type, and those - /// elements must be larger integer types such that the total size of the - /// operand type is less than or equal to the size of the result type. Each - /// of the low operand elements is zero-extended into the corresponding, - /// wider result elements. - /// NOTE: The type legalizer prefers to make the operand and result size - /// the same to allow expansion to shuffle vector during op legalization. - ZERO_EXTEND_VECTOR_INREG, - - /// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned - /// integer. These have the same semantics as fptosi and fptoui in IR. If - /// the FP value cannot fit in the integer type, the results are undefined. 
- FP_TO_SINT, - FP_TO_UINT, - VP_FP_TO_SINT, - VP_FP_TO_UINT, - - /// X = FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating point type - /// down to the precision of the destination VT. TRUNC is a flag, which is - /// always an integer that is zero or one. If TRUNC is 0, this is a - /// normal rounding, if it is 1, this FP_ROUND is known to not change the - /// value of Y. - /// - /// The TRUNC = 1 case is used in cases where we know that the value will - /// not be modified by the node, because Y is not using any of the extra - /// precision of source type. This allows certain transformations like - /// FP_EXTEND(FP_ROUND(X,1)) -> X which are not safe for - /// FP_EXTEND(FP_ROUND(X,0)) because the extra bits aren't removed. - FP_ROUND, - - /// FLT_ROUNDS_ - Returns current rounding mode: - /// -1 Undefined - /// 0 Round to 0 - /// 1 Round to nearest - /// 2 Round to +inf - /// 3 Round to -inf - FLT_ROUNDS_, - - /// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type. - FP_EXTEND, - VP_FP_EXTEND, - - /// BITCAST - This operator converts between integer, vector and FP - /// values, as if the value was stored to memory with one type and loaded - /// from the same address with the other type (or equivalently for vector - /// format conversions, etc). The source and result are required to have - /// the same bit size (e.g. f32 <-> i32). This can also be used for - /// int-to-int or fp-to-fp conversions, but that is a noop, deleted by - /// getNode(). - /// - /// This operator is subtly different from the bitcast instruction from - /// LLVM-IR since this node may change the bits in the register. For - /// example, this occurs on big-endian NEON and big-endian MSA where the - /// layout of the bits in the register depends on the vector type and this - /// operator acts as a shuffle operation for some vector type combinations. - BITCAST, - - /// ADDRSPACECAST - This operator converts between pointers of different - /// address spaces. 
- ADDRSPACECAST, - - /// FP16_TO_FP, FP_TO_FP16 - These operators are used to perform promotions - /// and truncation for half-precision (16 bit) floating numbers. These nodes - /// form a semi-softened interface for dealing with f16 (as an i16), which - /// is often a storage-only type but has native conversions. - FP16_TO_FP, FP_TO_FP16, - STRICT_FP16_TO_FP, STRICT_FP_TO_FP16, - - /// Perform various unary floating-point operations inspired by libm. For - /// FPOWI, the result is undefined if the integer operand doesn't fit - /// into 32 bits. - FNEG, FABS, FSQRT, FCBRT, FSIN, FCOS, FPOWI, FPOW, - FLOG, FLOG2, FLOG10, FEXP, FEXP2, - FCEIL, FTRUNC, FRINT, FNEARBYINT, FROUND, FFLOOR, - LROUND, LLROUND, LRINT, LLRINT, - - VP_FNEG, VP_FABS, VP_FSQRT, VP_FCBRT, VP_FSIN, VP_FCOS, VP_FPOWI, VP_FPOW, - VP_FLOG, VP_FLOG2, VP_FLOG10, VP_FEXP, VP_FEXP2, - VP_FCEIL, VP_FTRUNC, VP_FRINT, VP_FNEARBYINT, VP_FROUND, VP_FFLOOR, - VP_LROUND, VP_LLROUND, VP_LRINT, VP_LLRINT, - /// FMINNUM/FMAXNUM - Perform floating-point minimum or maximum on two - /// values. - // - /// In the case where a single input is a NaN (either signaling or quiet), - /// the non-NaN input is returned. - /// - /// The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0. - FMINNUM, FMAXNUM, - VP_FMINNUM, VP_FMAXNUM, - - /// FMINNUM_IEEE/FMAXNUM_IEEE - Perform floating-point minimum or maximum on - /// two values, following the IEEE-754 2008 definition. This differs from - /// FMINNUM/FMAXNUM in the handling of signaling NaNs. If one input is a - /// signaling NaN, returns a quiet NaN. - FMINNUM_IEEE, FMAXNUM_IEEE, - - /// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0 - /// as less than 0.0. While FMINNUM_IEEE/FMAXNUM_IEEE follow IEEE 754-2008 - /// semantics, FMINIMUM/FMAXIMUM follow IEEE 754-2018 draft semantics. - FMINIMUM, FMAXIMUM, - - /// FSINCOS - Compute both fsin and fcos as a single operation.
- FSINCOS, - - /// LOAD and STORE have token chains as their first operand, then the same - /// operands as an LLVM load/store instruction, then an offset node that - /// is added / subtracted from the base pointer to form the address (for - /// indexed memory ops). - LOAD, STORE, - - /// DYNAMIC_STACKALLOC - Allocate some number of bytes on the stack aligned - /// to a specified boundary. This node always has two return values: a new - /// stack pointer value and a chain. The first operand is the token chain, - /// the second is the number of bytes to allocate, and the third is the - /// alignment boundary. The size is guaranteed to be a multiple of the - /// stack alignment, and the alignment is guaranteed to be bigger than the - /// stack alignment (if required) or 0 to get standard stack alignment. - DYNAMIC_STACKALLOC, - - /// Control flow instructions. These all have token chains. - - /// BR - Unconditional branch. The first operand is the chain - /// operand, the second is the MBB to branch to. - BR, - - /// BRIND - Indirect branch. The first operand is the chain, the second - /// is the value to branch to, which must be of the same type as the - /// target's pointer type. - BRIND, - - /// BR_JT - Jumptable branch. The first operand is the chain, the second - /// is the jumptable index, the last one is the jumptable entry index. - BR_JT, - - /// BRCOND - Conditional branch. The first operand is the chain, the - /// second is the condition, the third is the block to branch to if the - /// condition is true. If the type of the condition is not i1, then the - /// high bits must conform to getBooleanContents. - BRCOND, - - /// BR_CC - Conditional branch. The behavior is like that of SELECT_CC, in - /// that the condition is represented as condition code, and two nodes to - /// compare, rather than as a combined SetCC node. The operands in order - /// are chain, cc, lhs, rhs, block to branch to if condition is true. 
- BR_CC, - - /// INLINEASM - Represents an inline asm block. This node always has two - /// return values: a chain and a flag result. The inputs are as follows: - /// Operand #0 : Input chain. - /// Operand #1 : an ExternalSymbolSDNode with a pointer to the asm string. - /// Operand #2 : an MDNodeSDNode with the !srcloc metadata. - /// Operand #3 : HasSideEffect, IsAlignStack bits. - /// After this, it is followed by a list of operands with this format: - /// ConstantSDNode: Flags that encode whether it is a mem or not, the - /// number of operands that follow, etc. See InlineAsm.h. - /// ... however many operands ... - /// Operand #last: Optional, an incoming flag. - /// - /// The variable width operands are required to represent target addressing - /// modes as a single "operand", even though they may have multiple - /// SDOperands. - INLINEASM, - - /// INLINEASM_BR - Terminator version of inline asm. Used by asm-goto. - INLINEASM_BR, - - /// EH_LABEL - Represents a label in mid basic block used to track - /// locations needed for debug and exception handling tables. These nodes - /// take a chain as input and return a chain. - EH_LABEL, - - /// ANNOTATION_LABEL - Represents a mid basic block label used by - /// annotations. This should remain within the basic block and be ordered - /// with respect to other call instructions, but loads and stores may float - /// past it. - ANNOTATION_LABEL, - - /// CATCHRET - Represents a return from a catch block funclet. Used for - /// MSVC compatible exception handling. Takes a chain operand and a - /// destination basic block operand. - CATCHRET, - - /// CLEANUPRET - Represents a return from a cleanup block funclet. Used for - /// MSVC compatible exception handling. Takes only a chain operand. - CLEANUPRET, - - /// STACKSAVE - STACKSAVE has one operand, an input chain. It produces a - /// value, the same type as the pointer type for the system, and an output - /// chain.
- STACKSAVE, - - /// STACKRESTORE has two operands, an input chain and a pointer to restore - /// to; it returns an output chain. - STACKRESTORE, - - /// CALLSEQ_START/CALLSEQ_END - These operators mark the beginning and end - /// of a call sequence, and carry arbitrary information that target might - /// want to know. The first operand is a chain, the rest are specified by - /// the target and not touched by the DAG optimizers. - /// Targets that may use stack to pass call arguments define additional - /// operands: - /// - size of the call frame part that must be set up within the - /// CALLSEQ_START..CALLSEQ_END pair, - /// - part of the call frame prepared prior to CALLSEQ_START. - /// Both these parameters must be constants, their sum is the total call - /// frame size. - /// CALLSEQ_START..CALLSEQ_END pairs may not be nested. - CALLSEQ_START, // Beginning of a call sequence - CALLSEQ_END, // End of a call sequence - - /// VAARG - VAARG has four operands: an input chain, a pointer, a SRCVALUE, - /// and the alignment. It returns a pair of values: the vaarg value and a - /// new chain. - VAARG, - - /// VACOPY - VACOPY has 5 operands: an input chain, a destination pointer, - /// a source pointer, a SRCVALUE for the destination, and a SRCVALUE for the - /// source. - VACOPY, - - /// VAEND, VASTART - VAEND and VASTART have three operands: an input chain, - /// pointer, and a SRCVALUE. - VAEND, VASTART, - - /// SRCVALUE - This is a node type that holds a Value that is used to - /// make reference to a value in the LLVM IR. - SRCVALUE, - - /// MDNODE_SDNODE - This is a node that holds an MDNode, which is used to - /// reference metadata in the IR. - MDNODE_SDNODE, - - /// PCMARKER - This corresponds to the pcmarker intrinsic. - PCMARKER, - - /// READCYCLECOUNTER - This corresponds to the readcyclecounter intrinsic. - /// It produces a chain and one i64 value. The only operand is a chain. - /// If i64 is not legal, the result will be expanded into smaller values.
- /// Still, it returns an i64, so targets should set legality for i64. - /// The result is the content of the architecture-specific cycle - /// counter-like register (or other high accuracy low latency clock source). - READCYCLECOUNTER, - - /// HANDLENODE node - Used as a handle for various purposes. - HANDLENODE, - - /// INIT_TRAMPOLINE - This corresponds to the init_trampoline intrinsic. It - /// takes as input a token chain, the pointer to the trampoline, the pointer - /// to the nested function, the pointer to pass for the 'nest' parameter, a - /// SRCVALUE for the trampoline and another for the nested function - /// (allowing targets to access the original Function). - /// It produces a token chain as output. - INIT_TRAMPOLINE, - - /// ADJUST_TRAMPOLINE - This corresponds to the adjust_trampoline intrinsic. - /// It takes a pointer to the trampoline and produces a (possibly) new - /// pointer to the same trampoline with platform-specific adjustments - /// applied. The pointer it returns points to an executable block of code. - ADJUST_TRAMPOLINE, - - /// TRAP - Trapping instruction - TRAP, - - /// DEBUGTRAP - Trap intended to get the attention of a debugger. - DEBUGTRAP, - - /// PREFETCH - This corresponds to a prefetch intrinsic. The first operand - /// is the chain. The other operands are the address to prefetch, - /// read / write specifier, locality specifier and instruction / data cache - /// specifier. - PREFETCH, - - /// OUTCHAIN = ATOMIC_FENCE(INCHAIN, ordering, scope) - /// This corresponds to the fence instruction. It takes an input chain, and - /// two integer constants: an AtomicOrdering and a SynchronizationScope. - ATOMIC_FENCE, - - /// Val, OUTCHAIN = ATOMIC_LOAD(INCHAIN, ptr) - /// This corresponds to "load atomic" instruction. - ATOMIC_LOAD, - - /// OUTCHAIN = ATOMIC_STORE(INCHAIN, ptr, val) - /// This corresponds to "store atomic" instruction. 
- ATOMIC_STORE, - - /// Val, OUTCHAIN = ATOMIC_CMP_SWAP(INCHAIN, ptr, cmp, swap) - /// For double-word atomic operations: - /// ValLo, ValHi, OUTCHAIN = ATOMIC_CMP_SWAP(INCHAIN, ptr, cmpLo, cmpHi, - /// swapLo, swapHi) - /// This corresponds to the cmpxchg instruction. - ATOMIC_CMP_SWAP, - - /// Val, Success, OUTCHAIN - /// = ATOMIC_CMP_SWAP_WITH_SUCCESS(INCHAIN, ptr, cmp, swap) - /// N.b. this is still a strong cmpxchg operation, so - /// Success == "Val == cmp". - ATOMIC_CMP_SWAP_WITH_SUCCESS, - - /// Val, OUTCHAIN = ATOMIC_SWAP(INCHAIN, ptr, amt) - /// Val, OUTCHAIN = ATOMIC_LOAD_[OpName](INCHAIN, ptr, amt) - /// For double-word atomic operations: - /// ValLo, ValHi, OUTCHAIN = ATOMIC_SWAP(INCHAIN, ptr, amtLo, amtHi) - /// ValLo, ValHi, OUTCHAIN = ATOMIC_LOAD_[OpName](INCHAIN, ptr, amtLo, amtHi) - /// These correspond to the atomicrmw instruction. - ATOMIC_SWAP, - ATOMIC_LOAD_ADD, - ATOMIC_LOAD_SUB, - ATOMIC_LOAD_AND, - ATOMIC_LOAD_CLR, - ATOMIC_LOAD_OR, - ATOMIC_LOAD_XOR, - ATOMIC_LOAD_NAND, - ATOMIC_LOAD_MIN, - ATOMIC_LOAD_MAX, - ATOMIC_LOAD_UMIN, - ATOMIC_LOAD_UMAX, - ATOMIC_LOAD_FADD, - ATOMIC_LOAD_FSUB, - - // Masked load and store - consecutive vector load and store operations - // with additional mask operand that prevents memory accesses to the - // masked-off lanes. - // - // Val, OutChain = MLOAD(BasePtr, Mask, PassThru) - // OutChain = MSTORE(Value, BasePtr, Mask) - MLOAD, MSTORE, - VP_LOAD, VP_STORE, - - // Masked gather and scatter - load and store operations for a vector of - // random addresses with additional mask operand that prevents memory - // accesses to the masked-off lanes. - // - // Val, OutChain = GATHER(InChain, PassThru, Mask, BasePtr, Index, Scale) - // OutChain = SCATTER(InChain, Value, Mask, BasePtr, Index, Scale) - // - // The Index operand can have more vector elements than the other operands - // due to type legalization. The extra elements are ignored. 
- MGATHER, MSCATTER, - - // VP gather and scatter - load and store operations for a vector of - // random addresses with additional mask and vector length operand that - // prevents memory accesses to the masked-off lanes. - // - // Val, OutChain = VP_GATHER(InChain, BasePtr, Index, Scale, Mask, EVL) - // OutChain = VP_SCATTER(InChain, Value, BasePtr, Index, Scale, Mask, EVL) - // - // The Index operand can have more vector elements than the other operands - // due to type legalization. The extra elements are ignored. - VP_GATHER, VP_SCATTER, - - /// This corresponds to the llvm.lifetime.* intrinsics. The first operand - /// is the chain and the second operand is the alloca pointer. - LIFETIME_START, LIFETIME_END, - - /// GC_TRANSITION_START/GC_TRANSITION_END - These operators mark the - /// beginning and end of GC transition sequence, and carry arbitrary - /// information that target might need for lowering. The first operand is - /// a chain, the rest are specified by the target and not touched by the DAG - /// optimizers. GC_TRANSITION_START..GC_TRANSITION_END pairs may not be - /// nested. - GC_TRANSITION_START, - GC_TRANSITION_END, - - /// GET_DYNAMIC_AREA_OFFSET - get offset from native SP to the address of - /// the most recent dynamic alloca. For most targets that would be 0, but - /// for some others (e.g. PowerPC, PowerPC64) that would be compile-time - /// known nonzero constant. The only operand here is the chain. - GET_DYNAMIC_AREA_OFFSET, - - /// VSCALE(IMM) - Returns the runtime scaling factor used to calculate the - /// number of elements within a scalable vector. IMM is a constant integer - /// multiplier that is applied to the runtime value. - VSCALE, - - /// Generic reduction nodes. These nodes represent horizontal vector - /// reduction operations, producing a scalar result. - /// The STRICT variants perform reductions in sequential order. 
The first - /// operand is an initial scalar accumulator value, and the second operand - /// is the vector to reduce. - VECREDUCE_STRICT_FADD, VECREDUCE_STRICT_FMUL, - /// These reductions are non-strict, and have a single vector operand. - VECREDUCE_FADD, VECREDUCE_FMUL, - /// FMIN/FMAX nodes can have flags, for NaN/NoNaN variants. - VECREDUCE_FMAX, VECREDUCE_FMIN, - /// Integer reductions may have a result type larger than the vector element - /// type. However, the reduction is performed using the vector element type - /// and the value in the top bits is unspecified. - VECREDUCE_ADD, VECREDUCE_MUL, - VECREDUCE_AND, VECREDUCE_OR, VECREDUCE_XOR, - VECREDUCE_SMAX, VECREDUCE_SMIN, VECREDUCE_UMAX, VECREDUCE_UMIN, - - VP_REDUCE_FADD, VP_REDUCE_FMUL, - VP_REDUCE_ADD, VP_REDUCE_MUL, - VP_REDUCE_AND, VP_REDUCE_OR, VP_REDUCE_XOR, - VP_REDUCE_SMAX, VP_REDUCE_SMIN, VP_REDUCE_UMAX, VP_REDUCE_UMIN, - - /// FMIN/FMAX nodes can have flags, for NaN/NoNaN variants. - VP_REDUCE_FMAX, VP_REDUCE_FMIN, - - /// BUILTIN_OP_END - This must be the last enum value in this list. - /// The target-specific pre-isel opcode values start here. - BUILTIN_OP_END - }; - - /// FIRST_TARGET_STRICTFP_OPCODE - Target-specific pre-isel operations - /// which cannot raise FP exceptions should be less than this value. - /// Those that do must not be less than this value. - static const int FIRST_TARGET_STRICTFP_OPCODE = BUILTIN_OP_END+400; - - /// FIRST_TARGET_MEMORY_OPCODE - Target-specific pre-isel operations - /// which do not reference a specific memory location should be less than - /// this value. Those that do must not be less than this value, and can - /// be used with SelectionDAG::getMemIntrinsicNode. - static const int FIRST_TARGET_MEMORY_OPCODE = BUILTIN_OP_END+500; - - //===--------------------------------------------------------------------===// - /// MemIndexedMode enum - This enum defines the load / store indexed - /// addressing modes. 
+enum NodeType { + /// DELETED_NODE - This is an illegal value that is used to catch + /// errors. This opcode is not a legal opcode for any node. + DELETED_NODE, + + /// EntryToken - This is the marker used to indicate the start of a region. + EntryToken, + + /// TokenFactor - This node takes multiple tokens as input and produces a + /// single token result. This is used to represent the fact that the operand + /// operators are independent of each other. + TokenFactor, + + /// AssertSext, AssertZext - These nodes record if a register contains a + /// value that has already been zero or sign extended from a narrower type. + /// These nodes take two operands. The first is the node that has already + /// been extended, and the second is a value type node indicating the width + /// of the extension + AssertSext, + AssertZext, + + /// Various leaf nodes. + BasicBlock, + VALUETYPE, + CONDCODE, + Register, + RegisterMask, + Constant, + ConstantFP, + GlobalAddress, + GlobalTLSAddress, + FrameIndex, + JumpTable, + ConstantPool, + ExternalSymbol, + BlockAddress, + + /// The address of the GOT + GLOBAL_OFFSET_TABLE, + + /// FRAMEADDR, RETURNADDR - These nodes represent llvm.frameaddress and + /// llvm.returnaddress on the DAG. These nodes take one operand, the index + /// of the frame or return address to return. An index of zero corresponds + /// to the current function's frame or return address, an index of one to + /// the parent's frame or return address, and so on. + FRAMEADDR, + RETURNADDR, + ADDROFRETURNADDR, + SPONENTRY, + + /// LOCAL_RECOVER - Represents the llvm.localrecover intrinsic. + /// Materializes the offset from the local object pointer of another + /// function to a particular local object passed to llvm.localescape. The + /// operand is the MCSymbol label used to represent this offset, since + /// typically the offset is not known until after code generation of the + /// parent. 
+ LOCAL_RECOVER, + + /// READ_REGISTER, WRITE_REGISTER - This node represents llvm.register on + /// the DAG, which implements the named register global variables extension. + READ_REGISTER, + WRITE_REGISTER, + + /// FRAME_TO_ARGS_OFFSET - This node represents offset from frame pointer to + /// first (possible) on-stack argument. This is needed for correct stack + /// adjustment during unwind. + FRAME_TO_ARGS_OFFSET, + + /// EH_DWARF_CFA - This node represents the pointer to the DWARF Canonical + /// Frame Address (CFA), generally the value of the stack pointer at the + /// call site in the previous frame. + EH_DWARF_CFA, + + /// OUTCHAIN = EH_RETURN(INCHAIN, OFFSET, HANDLER) - This node represents + /// 'eh_return' gcc dwarf builtin, which is used to return from + /// exception. The general meaning is: adjust stack by OFFSET and pass + /// execution to HANDLER. Many platform-related details also :) + EH_RETURN, + + /// RESULT, OUTCHAIN = EH_SJLJ_SETJMP(INCHAIN, buffer) + /// This corresponds to the eh.sjlj.setjmp intrinsic. + /// It takes an input chain and a pointer to the jump buffer as inputs + /// and returns an outchain. + EH_SJLJ_SETJMP, + + /// OUTCHAIN = EH_SJLJ_LONGJMP(INCHAIN, buffer) + /// This corresponds to the eh.sjlj.longjmp intrinsic. + /// It takes an input chain and a pointer to the jump buffer as inputs + /// and returns an outchain. + EH_SJLJ_LONGJMP, + + /// OUTCHAIN = EH_SJLJ_SETUP_DISPATCH(INCHAIN) + /// The target initializes the dispatch table here. + EH_SJLJ_SETUP_DISPATCH, + + /// TargetConstant* - Like Constant, but the DAG does not do any folding, + /// simplification, or lowering of the constant. They are used for constants + /// which are known to fit in the immediate fields of their users, or for + /// carrying magic numbers which are not values which need to be + /// materialized in registers. 
+ TargetConstant, + TargetConstantFP, + + /// TargetGlobalAddress - Like GlobalAddress, but the DAG does no folding or + /// anything else with this node, and this is valid in the target-specific + /// dag, turning into a GlobalAddress operand. + TargetGlobalAddress, + TargetGlobalTLSAddress, + TargetFrameIndex, + TargetJumpTable, + TargetConstantPool, + TargetExternalSymbol, + TargetBlockAddress, + + MCSymbol, + + /// TargetIndex - Like a constant pool entry, but with completely + /// target-dependent semantics. Holds target flags, a 32-bit index, and a + /// 64-bit index. Targets can use this however they like. + TargetIndex, + + /// RESULT = INTRINSIC_WO_CHAIN(INTRINSICID, arg1, arg2, ...) + /// This node represents a target intrinsic function with no side effects. + /// The first operand is the ID number of the intrinsic from the + /// llvm::Intrinsic namespace. The operands to the intrinsic follow. The + /// node returns the result of the intrinsic. + INTRINSIC_WO_CHAIN, + + /// RESULT,OUTCHAIN = INTRINSIC_W_CHAIN(INCHAIN, INTRINSICID, arg1, ...) + /// This node represents a target intrinsic function with side effects that + /// returns a result. The first operand is a chain pointer. The second is + /// the ID number of the intrinsic from the llvm::Intrinsic namespace. The + /// operands to the intrinsic follow. The node has two results, the result + /// of the intrinsic and an output chain. + INTRINSIC_W_CHAIN, + + /// OUTCHAIN = INTRINSIC_VOID(INCHAIN, INTRINSICID, arg1, arg2, ...) + /// This node represents a target intrinsic function with side effects that + /// does not return a result. The first operand is a chain pointer. The + /// second is the ID number of the intrinsic from the llvm::Intrinsic + /// namespace. The operands to the intrinsic follow. + INTRINSIC_VOID, + + /// CopyToReg - This node has three operands: a chain, a register number to + /// set to this value, and a value. 
+ CopyToReg, + + /// CopyFromReg - This node indicates that the input value is a virtual or + /// physical register that is defined outside of the scope of this + /// SelectionDAG. The register is available from the RegisterSDNode object. + CopyFromReg, + + /// UNDEF - An undefined node. + UNDEF, + + /// EXTRACT_ELEMENT - This is used to get the lower or upper (determined by + /// a Constant, which is required to be operand #1) half of the integer or + /// float value specified as operand #0. This is only for use before + /// legalization, for values that will be broken into multiple registers. + EXTRACT_ELEMENT, + + /// BUILD_PAIR - This is the opposite of EXTRACT_ELEMENT in some ways. + /// Given two values of the same integer value type, this produces a value + /// twice as big. Like EXTRACT_ELEMENT, this can only be used before + /// legalization. The lower part of the composite value should be in + /// element 0 and the upper part should be in element 1. + BUILD_PAIR, + + /// MERGE_VALUES - This node takes multiple discrete operands and returns + /// them all as its individual results. This node has exactly the same + /// number of inputs and outputs. This node is useful for some pieces of the + /// code generator that want to think about a single node with multiple + /// results, not multiple nodes. + MERGE_VALUES, + + /// Simple integer binary arithmetic operators. + ADD, + SUB, + MUL, + SDIV, + UDIV, + SREM, + UREM, + VP_ADD, + VP_SUB, + VP_MUL, + VP_SDIV, + VP_UDIV, + VP_SREM, + VP_UREM, + + /// SMUL_LOHI/UMUL_LOHI - Multiply two integers of type iN, producing + /// a signed/unsigned value of type i[2N], and return the full value as + /// two results, each of type iN. + SMUL_LOHI, + UMUL_LOHI, + + /// SDIVREM/UDIVREM - Divide two integers and produce both a quotient and + /// remainder result. + SDIVREM, + UDIVREM, + + /// CARRY_FALSE - This node is used when folding other nodes, + /// like ADDC/SUBC, which indicate the carry result is always false.
+ CARRY_FALSE, + + /// Carry-setting nodes for multiple precision addition and subtraction. + /// These nodes take two operands of the same value type, and produce two + /// results. The first result is the normal add or sub result, the second + /// result is the carry flag result. + /// FIXME: These nodes are deprecated in favor of ADDCARRY and SUBCARRY. + /// They are kept around for now to provide a smooth transition path + /// toward the use of ADDCARRY/SUBCARRY and will eventually be removed. + ADDC, + SUBC, + + /// Carry-using nodes for multiple precision addition and subtraction. These + /// nodes take three operands: The first two are the normal lhs and rhs to + /// the add or sub, and the third is the input carry flag. These nodes + /// produce two results; the normal result of the add or sub, and the output + /// carry flag. These nodes both read and write a carry flag to allow them + /// to them to be chained together for add and sub of arbitrarily large + /// values. + ADDE, + SUBE, + + /// Carry-using nodes for multiple precision addition and subtraction. + /// These nodes take three operands: The first two are the normal lhs and + /// rhs to the add or sub, and the third is a boolean indicating if there + /// is an incoming carry. These nodes produce two results: the normal + /// result of the add or sub, and the output carry so they can be chained + /// together. The use of this opcode is preferable to adde/sube if the + /// target supports it, as the carry is a regular value rather than a + /// glue, which allows further optimisation. + ADDCARRY, + SUBCARRY, + + /// RESULT, BOOL = [SU]ADDO(LHS, RHS) - Overflow-aware nodes for addition. + /// These nodes take two operands: the normal LHS and RHS to the add. They + /// produce two results: the normal result of the add, and a boolean that + /// indicates if an overflow occurred (not a flag, because it may be store + /// to memory, etc.). 
If the type of the boolean is not i1 then the high + /// bits conform to getBooleanContents. + /// These nodes are generated from llvm.[su]add.with.overflow intrinsics. + SADDO, + UADDO, + + /// Same for subtraction. + SSUBO, + USUBO, + + /// Same for multiplication. + SMULO, + UMULO, + + /// RESULT = [US]ADDSAT(LHS, RHS) - Perform saturation addition on 2 + /// integers with the same bit width (W). If the true value of LHS + RHS + /// exceeds the largest value that can be represented by W bits, the + /// resulting value is this maximum value. Otherwise, if this value is less + /// than the smallest value that can be represented by W bits, the + /// resulting value is this minimum value. + SADDSAT, + UADDSAT, + + /// RESULT = [US]SUBSAT(LHS, RHS) - Perform saturation subtraction on 2 + /// integers with the same bit width (W). If the true value of LHS - RHS + /// exceeds the largest value that can be represented by W bits, the + /// resulting value is this maximum value. Otherwise, if this value is less + /// than the smallest value that can be represented by W bits, the + /// resulting value is this minimum value. + SSUBSAT, + USUBSAT, + + /// RESULT = [US]MULFIX(LHS, RHS, SCALE) - Perform fixed point multiplication + /// on + /// 2 integers with the same width and scale. SCALE represents the scale of + /// both operands as fixed point numbers. This SCALE parameter must be a + /// constant integer. A scale of zero is effectively performing + /// multiplication on 2 integers. + SMULFIX, + UMULFIX, + + /// Same as the corresponding unsaturated fixed point instructions, but the + /// result is clamped between the min and max values representable by the + /// bits of the first 2 operands. + SMULFIXSAT, + UMULFIXSAT, + + /// RESULT = [US]DIVFIX(LHS, RHS, SCALE) - Perform fixed point division on + /// 2 integers with the same width and scale. SCALE represents the scale + /// of both operands as fixed point numbers. This SCALE parameter must be a + /// constant integer. 
+ SDIVFIX, + UDIVFIX, + + /// Same as the corresponding unsaturated fixed point instructions, but the + /// result is clamped between the min and max values representable by the + /// bits of the first 2 operands. + SDIVFIXSAT, + UDIVFIXSAT, + + /// Simple binary floating point operators. + FADD, + FSUB, + FMUL, + FDIV, + FREM, + VP_FADD, + VP_FSUB, + VP_FMUL, + VP_FDIV, + VP_FREM, + + /// Constrained versions of the binary floating point operators. + /// These will be lowered to the simple operators before final selection. + /// They are used to limit optimizations while the DAG is being + /// optimized. + STRICT_FADD, + STRICT_FSUB, + STRICT_FMUL, + STRICT_FDIV, + STRICT_FREM, + STRICT_FMA, + + /// Constrained versions of libm-equivalent floating point intrinsics. + /// These will be lowered to the equivalent non-constrained pseudo-op + /// (or expanded to the equivalent library call) before final selection. + /// They are used to limit optimizations while the DAG is being optimized. + STRICT_FSQRT, + STRICT_FPOW, + STRICT_FPOWI, + STRICT_FSIN, + STRICT_FCOS, + STRICT_FEXP, + STRICT_FEXP2, + STRICT_FLOG, + STRICT_FLOG10, + STRICT_FLOG2, + STRICT_FRINT, + STRICT_FNEARBYINT, + STRICT_FMAXNUM, + STRICT_FMINNUM, + STRICT_FCEIL, + STRICT_FFLOOR, + STRICT_FROUND, + STRICT_FTRUNC, + STRICT_LROUND, + STRICT_LLROUND, + STRICT_LRINT, + STRICT_LLRINT, + STRICT_FMAXIMUM, + STRICT_FMINIMUM, + + /// STRICT_FP_TO_[US]INT - Convert a floating point value to a signed or + /// unsigned integer. These have the same semantics as fptosi and fptoui + /// in IR. + /// They are used to limit optimizations while the DAG is being optimized. + STRICT_FP_TO_SINT, + STRICT_FP_TO_UINT, + + /// STRICT_[US]INT_TO_FP - Convert a signed or unsigned integer to + /// a floating point value. These have the same semantics as sitofp and + /// uitofp in IR. + /// They are used to limit optimizations while the DAG is being optimized. 
+ STRICT_SINT_TO_FP, + STRICT_UINT_TO_FP, + + /// X = STRICT_FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating + /// point type down to the precision of the destination VT. TRUNC is a + /// flag, which is always an integer that is zero or one. If TRUNC is 0, + /// this is a normal rounding, if it is 1, this FP_ROUND is known to not + /// change the value of Y. Lint: Pre-merge checks: clang-format: please reformat the code ``` - enum NodeType { - /// DELETED_NODE - This is…
/// DELETED_NODE - This is an illegal value that is used to catch		/// DELETED_NODE - This is an illegal value that is used to catch
/// errors. This opcode is not a legal opcode for any node.		/// errors. This opcode is not a legal opcode for any node.
DELETED_NODE,		DELETED_NODE,

/// EntryToken - This is the marker used to indicate the start of a region.		/// EntryToken - This is the marker used to indicate the start of a region.
EntryToken,		EntryToken,

/// TokenFactor - This node takes multiple tokens as input and produces a		/// TokenFactor - This node takes multiple tokens as input and produces a
[146 unchanged lines not shown]
/// them all as its individual results. This node has exactly the same		/// them all as its individual results. This node has exactly the same
/// number of inputs and outputs. This node is useful for some pieces of the		/// number of inputs and outputs. This node is useful for some pieces of the
/// code generator that want to think about a single node with multiple		/// code generator that want to think about a single node with multiple
/// results, not multiple nodes.		/// results, not multiple nodes.
MERGE_VALUES,		MERGE_VALUES,

/// Simple integer binary arithmetic operators.		/// Simple integer binary arithmetic operators.
ADD, SUB, MUL, SDIV, UDIV, SREM, UREM,		ADD, SUB, MUL, SDIV, UDIV, SREM, UREM,
		VP_ADD, VP_SUB, VP_MUL, VP_SDIV, VP_UDIV, VP_SREM, VP_UREM,

/// SMUL_LOHI/UMUL_LOHI - Multiply two integers of type iN, producing		/// SMUL_LOHI/UMUL_LOHI - Multiply two integers of type iN, producing
/// a signed/unsigned value of type i[2*N], and return the full value as		/// a signed/unsigned value of type i[2*N], and return the full value as
/// two results, each of type iN.		/// two results, each of type iN.
SMUL_LOHI, UMUL_LOHI,		SMUL_LOHI, UMUL_LOHI,

/// SDIVREM/UDIVREM - Divide two integers and produce both a quotient and		/// SDIVREM/UDIVREM - Divide two integers and produce both a quotient and
/// remainder result.		/// remainder result.
[82 unchanged lines not shown]

/// Same as the corresponding unsaturated fixed point instructions, but the		/// Same as the corresponding unsaturated fixed point instructions, but the
/// result is clamped between the min and max values representable by the		/// result is clamped between the min and max values representable by the
/// bits of the first 2 operands.		/// bits of the first 2 operands.
SDIVFIXSAT, UDIVFIXSAT,		SDIVFIXSAT, UDIVFIXSAT,

/// Simple binary floating point operators.		/// Simple binary floating point operators.
FADD, FSUB, FMUL, FDIV, FREM,		FADD, FSUB, FMUL, FDIV, FREM,
		VP_FADD, VP_FSUB, VP_FMUL, VP_FDIV, VP_FREM,

/// Constrained versions of the binary floating point operators.		/// Constrained versions of the binary floating point operators.
/// These will be lowered to the simple operators before final selection.		/// These will be lowered to the simple operators before final selection.
/// They are used to limit optimizations while the DAG is being		/// They are used to limit optimizations while the DAG is being
/// optimized.		/// optimized.
STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,		STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,
STRICT_FMA,		STRICT_FMA,

[45 unchanged lines not shown]
/// STRICT_FSETCC/STRICT_FSETCCS - Constrained versions of SETCC, used		/// STRICT_FSETCC/STRICT_FSETCCS - Constrained versions of SETCC, used
/// for floating-point operands only. STRICT_FSETCC performs a quiet		/// for floating-point operands only. STRICT_FSETCC performs a quiet
/// comparison operation, while STRICT_FSETCCS performs a signaling		/// comparison operation, while STRICT_FSETCCS performs a signaling
/// comparison operation.		/// comparison operation.
STRICT_FSETCC, STRICT_FSETCCS,		STRICT_FSETCC, STRICT_FSETCCS,

/// FMA - Perform a * b + c with no intermediate rounding step.		/// FMA - Perform a * b + c with no intermediate rounding step.
FMA,		FMA,
		VP_FMA,

/// FMAD - Perform a * b + c, while getting the same result as the		/// FMAD - Perform a * b + c, while getting the same result as the
/// separately rounded operations.		/// separately rounded operations.
FMAD,		FMAD,

/// FCOPYSIGN(X, Y) - Return the value of X with the sign of Y. NOTE: This		/// FCOPYSIGN(X, Y) - Return the value of X with the sign of Y. NOTE: This
/// DAG node does not require that X and Y have the same type, just that		/// DAG node does not require that X and Y have the same type, just that
/// they are both floating point. X and the result must have the same type.		/// they are both floating point. X and the result must have the same type.
[50 unchanged lines not shown]
/// VEC1/VEC2. A VECTOR_SHUFFLE node also contains an array of constant int		/// VEC1/VEC2. A VECTOR_SHUFFLE node also contains an array of constant int
/// values that indicate which value (or undef) each result element will		/// values that indicate which value (or undef) each result element will
/// get. These constant ints are accessible through the		/// get. These constant ints are accessible through the
/// ShuffleVectorSDNode class. This is quite similar to the Altivec		/// ShuffleVectorSDNode class. This is quite similar to the Altivec
/// 'vperm' instruction, except that the indices must be constants and are		/// 'vperm' instruction, except that the indices must be constants and are
/// in terms of the element size of VEC1/VEC2, not in terms of bytes.		/// in terms of the element size of VEC1/VEC2, not in terms of bytes.
VECTOR_SHUFFLE,		VECTOR_SHUFFLE,

		/// VP_VSHIFT(VEC1, AMOUNT, MASK, VLEN) - Returns a vector, of the same type as
		/// VEC1. AMOUNT is an integer value. The returned vector is equivalent
		/// to VEC1 shifted by AMOUNT (RETURNED_VEC[idx] = VEC1[idx + AMOUNT]).
		VP_VSHIFT,
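
The formula in the VP_VSHIFT comment can be sketched as a scalar reference model. `vshiftModel` and its parameter names are hypothetical and not part of this patch or of LLVM. The comment does not say what happens to lanes that are masked off, lie at or beyond VLEN, or would read outside VEC1, so leaving them empty is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical scalar model of VP_VSHIFT; not LLVM API.
// Follows the comment's formula: result[idx] = vec[idx + amount].
// Assumption (not stated in the patch): a lane stays empty (std::nullopt)
// when it is masked off, lies at or beyond vlen, or would read outside vec.
std::vector<std::optional<int>> vshiftModel(const std::vector<int>& vec,
                                            std::ptrdiff_t amount,
                                            const std::vector<bool>& mask,
                                            std::size_t vlen) {
  assert(mask.size() == vec.size() && vlen <= vec.size());
  const auto size = static_cast<std::ptrdiff_t>(vec.size());
  std::vector<std::optional<int>> result(vec.size());
  for (std::size_t idx = 0; idx < vlen; ++idx) {
    const std::ptrdiff_t src = static_cast<std::ptrdiff_t>(idx) + amount;
    if (mask[idx] && src >= 0 && src < size)
      result[idx] = vec[static_cast<std::size_t>(src)];
  }
  return result;
}
```

With an all-true mask, VLEN 4 and AMOUNT 1, the input `{1, 2, 3, 4}` yields `{2, 3, 4, empty}` under these assumptions.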

		/// VP_COMPRESS(VEC1, MASK, VLEN) - Returns a vector, of the same type as
		/// VEC1.
		VP_COMPRESS,

		/// VP_EXPAND(VEC1, MASK, VLEN) - Returns a vector, of the same type as
		/// VEC1.
		VP_EXPAND,

/// SCALAR_TO_VECTOR(VAL) - This represents the operation of loading a		/// SCALAR_TO_VECTOR(VAL) - This represents the operation of loading a
/// scalar value into element 0 of the resultant vector type. The top		/// scalar value into element 0 of the resultant vector type. The top
/// elements 1 to N-1 of the N-element vector are undefined. The type		/// elements 1 to N-1 of the N-element vector are undefined. The type
/// of the operand must match the vector element type, except when they		/// of the operand must match the vector element type, except when they
/// are integer types. In this case the operand is allowed to be wider		/// are integer types. In this case the operand is allowed to be wider
/// than the vector element type, and is implicitly truncated to it.		/// than the vector element type, and is implicitly truncated to it.
SCALAR_TO_VECTOR,		SCALAR_TO_VECTOR,

[10 unchanged lines not shown]
MULHU, MULHS,		MULHU, MULHS,

/// [US]{MIN/MAX} - Binary minimum or maximum of signed or unsigned		/// [US]{MIN/MAX} - Binary minimum or maximum of signed or unsigned
/// integers.		/// integers.
SMIN, SMAX, UMIN, UMAX,		SMIN, SMAX, UMIN, UMAX,

/// Bitwise operators - logical and, logical or, logical xor.		/// Bitwise operators - logical and, logical or, logical xor.
AND, OR, XOR,		AND, OR, XOR,
		VP_AND, VP_OR, VP_XOR,

/// ABS - Determine the unsigned absolute value of a signed integer value of		/// ABS - Determine the unsigned absolute value of a signed integer value of
/// the same bitwidth.		/// the same bitwidth.
/// Note: A value of INT_MIN will return INT_MIN, no saturation or overflow		/// Note: A value of INT_MIN will return INT_MIN, no saturation or overflow
/// is performed.		/// is performed.
ABS,		ABS,

/// Shift and rotation operations. After legalization, the type of the		/// Shift and rotation operations. After legalization, the type of the
/// shift amount is known to be TLI.getShiftAmountTy(). Before legalization		/// shift amount is known to be TLI.getShiftAmountTy(). Before legalization
/// the shift amount can be any type, but care must be taken to ensure it is		/// the shift amount can be any type, but care must be taken to ensure it is
/// large enough. TLI.getShiftAmountTy() is i8 on some targets, but before		/// large enough. TLI.getShiftAmountTy() is i8 on some targets, but before
/// legalization, types like i1024 can occur and i8 doesn't have enough bits		/// legalization, types like i1024 can occur and i8 doesn't have enough bits
/// to represent the shift amount.		/// to represent the shift amount.
/// When the 1st operand is a vector, the shift amount must be in the same		/// When the 1st operand is a vector, the shift amount must be in the same
/// type. (TLI.getShiftAmountTy() will return the same type when the input		/// type. (TLI.getShiftAmountTy() will return the same type when the input
/// type is a vector.)		/// type is a vector.)
/// For rotates and funnel shifts, the shift amount is treated as an unsigned		/// For rotates and funnel shifts, the shift amount is treated as an unsigned
/// amount modulo the element size of the first operand.		/// amount modulo the element size of the first operand.
///		///
/// Funnel 'double' shifts take 3 operands, 2 inputs and the shift amount.		/// Funnel 'double' shifts take 3 operands, 2 inputs and the shift amount.
/// fshl(X,Y,Z): (X << (Z % BW)) \| (Y >> (BW - (Z % BW)))		/// fshl(X,Y,Z): (X << (Z % BW)) \| (Y >> (BW - (Z % BW)))
/// fshr(X,Y,Z): (X << (BW - (Z % BW))) \| (Y >> (Z % BW))		/// fshr(X,Y,Z): (X << (BW - (Z % BW))) \| (Y >> (Z % BW))
SHL, SRA, SRL, ROTL, ROTR, FSHL, FSHR,		SHL, SRA, SRL, ROTL, ROTR, FSHL, FSHR,
		VP_SHL, VP_SRA, VP_SRL,
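
The fshl/fshr formulas in the comment above can be checked with a 32-bit scalar sketch. The helper names `fshl32` and `fshr32` are hypothetical. The zero-shift case is handled separately because a literal reading of the formula would shift by the full bit width, which is undefined behavior in C++.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit scalar sketch of the fshl/fshr formulas quoted above (BW = 32).
// Hypothetical helpers for illustration; not LLVM API.
constexpr std::uint32_t fshl32(std::uint32_t x, std::uint32_t y,
                               std::uint32_t z) {
  const std::uint32_t s = z % 32;
  // s == 0 would require y >> 32, so return x directly in that case.
  return s == 0 ? x : (x << s) | (y >> (32 - s));
}

constexpr std::uint32_t fshr32(std::uint32_t x, std::uint32_t y,
                               std::uint32_t z) {
  const std::uint32_t s = z % 32;
  // s == 0 would require x << 32, so return y directly in that case.
  return s == 0 ? y : (x << (32 - s)) | (y >> s);
}

// Passing the same value for both inputs gives a rotate.
static_assert(fshl32(0x12345678u, 0x12345678u, 4) == 0x23456781u, "rotl by 4");
```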

/// Byte Swap and Counting operators.		/// Byte Swap and Counting operators.
BSWAP, CTTZ, CTLZ, CTPOP, BITREVERSE,		BSWAP, CTTZ, CTLZ, CTPOP, BITREVERSE,

/// Bit counting operators with an undefined result for zero inputs.		/// Bit counting operators with an undefined result for zero inputs.
CTTZ_ZERO_UNDEF, CTLZ_ZERO_UNDEF,		CTTZ_ZERO_UNDEF, CTLZ_ZERO_UNDEF,

/// Select(COND, TRUEVAL, FALSEVAL). If the type of the boolean COND is not		/// Select(COND, TRUEVAL, FALSEVAL). If the type of the boolean COND is not
/// i1 then the high bits must conform to getBooleanContents.		/// i1 then the high bits must conform to getBooleanContents.
SELECT,		SELECT,

/// Select with a vector condition (op #0) and two vector operands (ops #1		/// Select with a vector condition (op #0) and two vector operands (ops #1
/// and #2), returning a vector result. All vectors have the same length.		/// and #2), returning a vector result. All vectors have the same length.
/// Much like the scalar select and setcc, each bit in the condition selects		/// Much like the scalar select and setcc, each bit in the condition selects
/// whether the corresponding result element is taken from op #1 or op #2.		/// whether the corresponding result element is taken from op #1 or op #2.
/// At first, the VSELECT condition is of vXi1 type. Later, targets may		/// At first, the VSELECT condition is of vXi1 type. Later, targets may
/// change the condition type in order to match the VSELECT node using a		/// change the condition type in order to match the VSELECT node using a
/// pattern. The condition follows the BooleanContent format of the target.		/// pattern. The condition follows the BooleanContent format of the target.
VSELECT,		VSELECT,
		VP_SELECT,

		/// Select with an integer pivot (op #0) and two vector operands (ops #1
		SjoerdMeijer: I was unfamiliar with this one... I think I know what it does, and how it is different from VP_SELECT, but for clarity, can you define what `integer pivot` is?
		/// and #2), returning a vector result. Op #3 is the vector length, all
		/// vectors have the same length.
		/// Vector elements below the pivot (op #0) are taken from op #1, elements
		SjoerdMeijer: typo: hether
		/// at positions greater than or equal to the pivot are taken from op #2.
		VP_COMPOSE,
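
To make the `integer pivot` in the VP_COMPOSE comment concrete, here is a scalar reference model, offered as a sketch only. `composeModel` and its parameter names are hypothetical and not part of this patch or of LLVM. The comment does not specify lanes at or beyond the vector length, so zero-filling them is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of VP_COMPOSE; not LLVM API.
// Lanes below `pivot` (op #0) come from `a` (op #1); lanes at or above the
// pivot come from `b` (op #2). `evl` models the vector length (op #3).
// Assumption: lanes at or beyond `evl` are left value-initialized (0),
// since the node comment does not define them.
std::vector<int> composeModel(std::size_t pivot, const std::vector<int>& a,
                              const std::vector<int>& b, std::size_t evl) {
  assert(a.size() == b.size() && evl <= a.size());
  std::vector<int> result(a.size());
  for (std::size_t i = 0; i < evl; ++i)
    result[i] = (i < pivot) ? a[i] : b[i];
  return result;
}
```

With pivot 2 and vector length 4, `{1, 2, 3, 4}` and `{10, 20, 30, 40}` combine to `{1, 2, 30, 40}`. Unlike VP_SELECT, no per-lane mask is involved: a single integer splits the result into a prefix from op #1 and a suffix from op #2.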

/// Select with condition operator - This selects between a true value and		/// Select with condition operator - This selects between a true value and
/// a false value (ops #2 and #3) based on the boolean result of comparing		/// a false value (ops #2 and #3) based on the boolean result of comparing
/// the lhs and rhs (ops #0 and #1) of a conditional expression with the		/// the lhs and rhs (ops #0 and #1) of a conditional expression with the
/// condition code in op #4, a CondCodeSDNode.		/// condition code in op #4, a CondCodeSDNode.
SELECT_CC,		SELECT_CC,

/// SetCC operator - This evaluates to a true value iff the condition is		/// SetCC operator - This evaluates to a true value iff the condition is
/// true. If the result value type is not i1 then the high bits conform		/// true. If the result value type is not i1 then the high bits conform
/// to getBooleanContents. The operands to this are the left and right		/// to getBooleanContents. The operands to this are the left and right
/// operands to compare (ops #0, and #1) and the condition code to compare		/// operands to compare (ops #0, and #1) and the condition code to compare
/// them with (op #2) as a CondCodeSDNode. If the operands are vector types		/// them with (op #2) as a CondCodeSDNode. If the operands are vector types
/// then the result type must also be a vector type.		/// then the result type must also be a vector type.
SETCC,		SETCC,
		VP_SETCC,

/// Like SetCC, ops #0 and #1 are the LHS and RHS operands to compare, but		/// Like SetCC, ops #0 and #1 are the LHS and RHS operands to compare, but
/// op #2 is a boolean indicating if there is an incoming carry. This		/// op #2 is a boolean indicating if there is an incoming carry. This
/// operator checks the result of "LHS - RHS - Carry", and can be used to		/// operator checks the result of "LHS - RHS - Carry", and can be used to
/// compare two wide integers:		/// compare two wide integers:
/// (setcccarry lhshi rhshi (subcarry lhslo rhslo) cc).		/// (setcccarry lhshi rhshi (subcarry lhslo rhslo) cc).
/// Only valid for integers.		/// Only valid for integers.
SETCCCARRY,		SETCCCARRY,
[20 unchanged lines not shown]

/// TRUNCATE - Completely drop the high bits.		/// TRUNCATE - Completely drop the high bits.
TRUNCATE,		TRUNCATE,

/// [SU]INT_TO_FP - These operators convert integers (whose interpreted sign		/// [SU]INT_TO_FP - These operators convert integers (whose interpreted sign
/// depends on the first letter) to floating point.		/// depends on the first letter) to floating point.
SINT_TO_FP,		SINT_TO_FP,
UINT_TO_FP,		UINT_TO_FP,
		VP_SINT_TO_FP,
		VP_UINT_TO_FP,

/// SIGN_EXTEND_INREG - This operator atomically performs a SHL/SRA pair to		/// SIGN_EXTEND_INREG - This operator atomically performs a SHL/SRA pair to
/// sign extend a small value in a large integer register (e.g. sign		/// sign extend a small value in a large integer register (e.g. sign
/// extending the low 8 bits of a 32-bit register to fill the top 24 bits		/// extending the low 8 bits of a 32-bit register to fill the top 24 bits
/// with the 7th bit). The size of the smaller type is indicated by the 1th		/// with the 7th bit). The size of the smaller type is indicated by the 1th
/// operand, a ValueType node.		/// operand, a ValueType node.
SIGN_EXTEND_INREG,		SIGN_EXTEND_INREG,

[30 unchanged lines not shown]
/// the same to allow expansion to shuffle vector during op legalization.		/// the same to allow expansion to shuffle vector during op legalization.
ZERO_EXTEND_VECTOR_INREG,		ZERO_EXTEND_VECTOR_INREG,

/// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned		/// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned
/// integer. These have the same semantics as fptosi and fptoui in IR. If		/// integer. These have the same semantics as fptosi and fptoui in IR. If
/// the FP value cannot fit in the integer type, the results are undefined.		/// the FP value cannot fit in the integer type, the results are undefined.
FP_TO_SINT,		FP_TO_SINT,
FP_TO_UINT,		FP_TO_UINT,
		VP_FP_TO_SINT,
		VP_FP_TO_UINT,

/// X = FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating point type		/// X = FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating point type
/// down to the precision of the destination VT. TRUNC is a flag, which is		/// down to the precision of the destination VT. TRUNC is a flag, which is
/// always an integer that is zero or one. If TRUNC is 0, this is a		/// always an integer that is zero or one. If TRUNC is 0, this is a
/// normal rounding, if it is 1, this FP_ROUND is known to not change the		/// normal rounding, if it is 1, this FP_ROUND is known to not change the
/// value of Y.		/// value of Y.
///		///
/// The TRUNC = 1 case is used in cases where we know that the value will		/// The TRUNC = 1 case is used in cases where we know that the value will
/// not be modified by the node, because Y is not using any of the extra		/// not be modified by the node, because Y is not using any of the extra
/// precision of source type. This allows certain transformations like		/// precision of source type. This allows certain transformations like
/// FP_EXTEND(FP_ROUND(X,1)) -> X which are not safe for		/// FP_EXTEND(FP_ROUND(X,1)) -> X which are not safe for
/// FP_EXTEND(FP_ROUND(X,0)) because the extra bits aren't removed.		/// FP_EXTEND(FP_ROUND(X,0)) because the extra bits aren't removed.
FP_ROUND,		FP_ROUND,

/// FLT_ROUNDS_ - Returns current rounding mode:		/// FLT_ROUNDS_ - Returns current rounding mode:
/// -1 Undefined		/// -1 Undefined
/// 0 Round to 0		/// 0 Round to 0
/// 1 Round to nearest		/// 1 Round to nearest
/// 2 Round to +inf		/// 2 Round to +inf
/// 3 Round to -inf		/// 3 Round to -inf
FLT_ROUNDS_,		FLT_ROUNDS_,

/// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type.		/// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type.
FP_EXTEND,		FP_EXTEND,
		VP_FP_EXTEND,

/// BITCAST - This operator converts between integer, vector and FP		/// BITCAST - This operator converts between integer, vector and FP
/// values, as if the value was stored to memory with one type and loaded		/// values, as if the value was stored to memory with one type and loaded
/// from the same address with the other type (or equivalently for vector		/// from the same address with the other type (or equivalently for vector
/// format conversions, etc). The source and result are required to have		/// format conversions, etc). The source and result are required to have
/// the same bit size (e.g. f32 <-> i32). This can also be used for		/// the same bit size (e.g. f32 <-> i32). This can also be used for
/// int-to-int or fp-to-fp conversions, but that is a noop, deleted by		/// int-to-int or fp-to-fp conversions, but that is a noop, deleted by
/// getNode().		/// getNode().
[19 unchanged lines not shown]
/// Perform various unary floating-point operations inspired by libm. For		/// Perform various unary floating-point operations inspired by libm. For
/// FPOWI, the result is undefined if the integer operand doesn't fit		/// FPOWI, the result is undefined if the integer operand doesn't fit
/// into 32 bits.		/// into 32 bits.
FNEG, FABS, FSQRT, FCBRT, FSIN, FCOS, FPOWI, FPOW,		FNEG, FABS, FSQRT, FCBRT, FSIN, FCOS, FPOWI, FPOW,
FLOG, FLOG2, FLOG10, FEXP, FEXP2,		FLOG, FLOG2, FLOG10, FEXP, FEXP2,
FCEIL, FTRUNC, FRINT, FNEARBYINT, FROUND, FFLOOR,		FCEIL, FTRUNC, FRINT, FNEARBYINT, FROUND, FFLOOR,
LROUND, LLROUND, LRINT, LLRINT,		LROUND, LLROUND, LRINT, LLRINT,

		VP_FNEG, VP_FABS, VP_FSQRT, VP_FCBRT, VP_FSIN, VP_FCOS, VP_FPOWI, VP_FPOW,
		VP_FLOG, VP_FLOG2, VP_FLOG10, VP_FEXP, VP_FEXP2,
		VP_FCEIL, VP_FTRUNC, VP_FRINT, VP_FNEARBYINT, VP_FROUND, VP_FFLOOR,
		VP_LROUND, VP_LLROUND, VP_LRINT, VP_LLRINT,
/// FMINNUM/FMAXNUM - Perform floating-point minimum or maximum on two		/// FMINNUM/FMAXNUM - Perform floating-point minimum or maximum on two
/// values.		/// values.
//		//
/// In the case where a single input is a NaN (either signaling or quiet),		/// In the case where a single input is a NaN (either signaling or quiet),
/// the non-NaN input is returned.		/// the non-NaN input is returned.
///		///
/// The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.		/// The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.
FMINNUM, FMAXNUM,		FMINNUM, FMAXNUM,
		VP_FMINNUM, VP_FMAXNUM,

/// FMINNUM_IEEE/FMAXNUM_IEEE - Perform floating-point minimum or maximum on		/// FMINNUM_IEEE/FMAXNUM_IEEE - Perform floating-point minimum or maximum on
/// two values, following the IEEE-754 2008 definition. This differs from		/// two values, following the IEEE-754 2008 definition. This differs from
/// FMINNUM/FMAXNUM in the handling of signaling NaNs. If one input is a		/// FMINNUM/FMAXNUM in the handling of signaling NaNs. If one input is a
/// signaling NaN, returns a quiet NaN.		/// signaling NaN, returns a quiet NaN.
FMINNUM_IEEE, FMAXNUM_IEEE,		FMINNUM_IEEE, FMAXNUM_IEEE,

/// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0		/// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0
[221 unchanged lines not shown]

// Masked load and store - consecutive vector load and store operations		// Masked load and store - consecutive vector load and store operations
// with additional mask operand that prevents memory accesses to the		// with additional mask operand that prevents memory accesses to the
// masked-off lanes.		// masked-off lanes.
//		//
// Val, OutChain = MLOAD(BasePtr, Mask, PassThru)		// Val, OutChain = MLOAD(BasePtr, Mask, PassThru)
// OutChain = MSTORE(Value, BasePtr, Mask)		// OutChain = MSTORE(Value, BasePtr, Mask)
MLOAD, MSTORE,		MLOAD, MSTORE,
		VP_LOAD, VP_STORE,

// Masked gather and scatter - load and store operations for a vector of		// Masked gather and scatter - load and store operations for a vector of
// random addresses with additional mask operand that prevents memory		// random addresses with additional mask operand that prevents memory
// accesses to the masked-off lanes.		// accesses to the masked-off lanes.
//		//
// Val, OutChain = GATHER(InChain, PassThru, Mask, BasePtr, Index, Scale)		// Val, OutChain = GATHER(InChain, PassThru, Mask, BasePtr, Index, Scale)
// OutChain = SCATTER(InChain, Value, Mask, BasePtr, Index, Scale)		// OutChain = SCATTER(InChain, Value, Mask, BasePtr, Index, Scale)
//		//
// The Index operand can have more vector elements than the other operands		// The Index operand can have more vector elements than the other operands
// due to type legalization. The extra elements are ignored.		// due to type legalization. The extra elements are ignored.
MGATHER, MSCATTER,		MGATHER, MSCATTER,

		// VP gather and scatter - load and store operations for a vector of
		// random addresses with additional mask and vector length operands that
		// prevent memory accesses to the masked-off lanes.
		//
		// Val, OutChain = VP_GATHER(InChain, BasePtr, Index, Scale, Mask, EVL)
		// OutChain = VP_SCATTER(InChain, Value, BasePtr, Index, Scale, Mask, EVL)
		//
		// The Index operand can have more vector elements than the other operands
		// due to type legalization. The extra elements are ignored.
		VP_GATHER, VP_SCATTER,
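
The VP_GATHER operand list can be read as the following scalar loop. This is a sketch under stated assumptions: `gatherModel` is a hypothetical name, the chain operands are omitted, the element type is fixed to 32-bit integers, and inactive lanes are modelled as empty because the patch comment does not define their value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical scalar model of the VP_GATHER operands; not LLVM API.
// Lane i loads from basePtr + index[i] * scale (in bytes), but only when
// mask[i] is set and i < evl. Inactive lanes stay std::nullopt, which is
// an assumption of this sketch rather than something the patch specifies.
std::vector<std::optional<std::int32_t>> gatherModel(
    const unsigned char* basePtr, const std::vector<std::size_t>& index,
    std::size_t scale, const std::vector<bool>& mask, std::size_t evl) {
  assert(mask.size() == index.size() && evl <= index.size());
  std::vector<std::optional<std::int32_t>> result(index.size());
  for (std::size_t i = 0; i < evl; ++i) {
    if (!mask[i])
      continue;  // masked-off lane: no memory access is performed
    std::int32_t value;
    std::memcpy(&value, basePtr + index[i] * scale, sizeof(value));
    result[i] = value;
  }
  return result;
}
```

The point of the model is that both the mask and the explicit vector length suppress memory accesses: a lane is touched only if it passes both checks.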

/// This corresponds to the llvm.lifetime.* intrinsics. The first operand		/// This corresponds to the llvm.lifetime.* intrinsics. The first operand
/// is the chain and the second operand is the alloca pointer.		/// is the chain and the second operand is the alloca pointer.
LIFETIME_START, LIFETIME_END,		LIFETIME_START, LIFETIME_END,

/// GC_TRANSITION_START/GC_TRANSITION_END - These operators mark the		/// GC_TRANSITION_START/GC_TRANSITION_END - These operators mark the
/// beginning and end of GC transition sequence, and carry arbitrary		/// beginning and end of GC transition sequence, and carry arbitrary
/// information that target might need for lowering. The first operand is		/// information that target might need for lowering. The first operand is
/// a chain, the rest are specified by the target and not touched by the DAG		/// a chain, the rest are specified by the target and not touched by the DAG
[25 unchanged lines not shown]
VECREDUCE_FMAX, VECREDUCE_FMIN,		VECREDUCE_FMAX, VECREDUCE_FMIN,
/// Integer reductions may have a result type larger than the vector element		/// Integer reductions may have a result type larger than the vector element
/// type. However, the reduction is performed using the vector element type		/// type. However, the reduction is performed using the vector element type
/// and the value in the top bits is unspecified.		/// and the value in the top bits is unspecified.
VECREDUCE_ADD, VECREDUCE_MUL,		VECREDUCE_ADD, VECREDUCE_MUL,
VECREDUCE_AND, VECREDUCE_OR, VECREDUCE_XOR,		VECREDUCE_AND, VECREDUCE_OR, VECREDUCE_XOR,
VECREDUCE_SMAX, VECREDUCE_SMIN, VECREDUCE_UMAX, VECREDUCE_UMIN,		VECREDUCE_SMAX, VECREDUCE_SMIN, VECREDUCE_UMAX, VECREDUCE_UMIN,

		VP_REDUCE_FADD, VP_REDUCE_FMUL,
		VP_REDUCE_ADD, VP_REDUCE_MUL,
		VP_REDUCE_AND, VP_REDUCE_OR, VP_REDUCE_XOR,
		VP_REDUCE_SMAX, VP_REDUCE_SMIN, VP_REDUCE_UMAX, VP_REDUCE_UMIN,

		/// FMIN/FMAX nodes can have flags, for NaN/NoNaN variants.
		VP_REDUCE_FMAX, VP_REDUCE_FMIN,

/// BUILTIN_OP_END - This must be the last enum value in this list.		/// BUILTIN_OP_END - This must be the last enum value in this list.
/// The target-specific pre-isel opcode values start here.		/// The target-specific pre-isel opcode values start here.
BUILTIN_OP_END		BUILTIN_OP_END
};		};

/// FIRST_TARGET_STRICTFP_OPCODE - Target-specific pre-isel operations		/// FIRST_TARGET_STRICTFP_OPCODE - Target-specific pre-isel operations
/// which cannot raise FP exceptions should be less than this value.		/// which cannot raise FP exceptions should be less than this value.
/// Those that do must not be less than this value.		/// Those that do must not be less than this value.
static const int FIRST_TARGET_STRICTFP_OPCODE = BUILTIN_OP_END+400;		static const int FIRST_TARGET_STRICTFP_OPCODE = BUILTIN_OP_END+400;

/// FIRST_TARGET_MEMORY_OPCODE - Target-specific pre-isel operations		/// FIRST_TARGET_MEMORY_OPCODE - Target-specific pre-isel operations
/// which do not reference a specific memory location should be less than		/// which do not reference a specific memory location should be less than
/// this value. Those that do must not be less than this value, and can		/// this value. Those that do must not be less than this value, and can
/// be used with SelectionDAG::getMemIntrinsicNode.		/// be used with SelectionDAG::getMemIntrinsicNode.
static const int FIRST_TARGET_MEMORY_OPCODE = BUILTIN_OP_END+500;		static const int FIRST_TARGET_MEMORY_OPCODE = BUILTIN_OP_END+500;
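
The two thresholds above split the target-specific opcode space into ranges. The following is a minimal sketch of that rule, not the in-tree implementation: the namespace, the placeholder value of `BUILTIN_OP_END`, and both predicate names are assumptions made for illustration only.

```cpp
#include <cassert>

namespace sketch {
// Placeholder value (assumption): the real BUILTIN_OP_END is whatever the
// last entry of the NodeType enum evaluates to.
constexpr int BUILTIN_OP_END = 1000;
constexpr int FIRST_TARGET_STRICTFP_OPCODE = BUILTIN_OP_END + 400;
constexpr int FIRST_TARGET_MEMORY_OPCODE = BUILTIN_OP_END + 500;

// Hypothetical helper: true if a target opcode lies in the range reserved for
// operations that may raise FP exceptions (not less than the threshold).
constexpr bool mayRaiseFPException(int Opcode) {
  return Opcode >= FIRST_TARGET_STRICTFP_OPCODE;
}

// Hypothetical helper: true if a target opcode lies in the range reserved for
// operations that reference a specific memory location.
constexpr bool referencesMemory(int Opcode) {
  return Opcode >= FIRST_TARGET_MEMORY_OPCODE;
}

// Compile-time checks of the range boundaries described in the comments.
static_assert(!mayRaiseFPException(BUILTIN_OP_END + 399), "below strict-FP range");
static_assert(mayRaiseFPException(BUILTIN_OP_END + 400), "first strict-FP opcode");
static_assert(!referencesMemory(BUILTIN_OP_END + 499), "below memory range");
static_assert(referencesMemory(BUILTIN_OP_END + 500), "first memory opcode");
} // namespace sketch
```

Under this reading, a target that numbers a new pre-isel node only has to pick the range that matches the node's behavior; generic code can then classify it with a single comparison.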

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
/// MemIndexedMode enum - This enum defines the load / store indexed		/// MemIndexedMode enum - This enum defines the load / store indexed
/// addressing modes.		/// addressing modes.
///		///
/// UNINDEXED "Normal" load / store. The effective address is already		/// UNINDEXED "Normal" load / store. The effective address is already
/// computed and is available in the base pointer. The offset		/// computed and is available in the base pointer. The offset
/// operand is always undefined. In addition to producing a		/// operand is always undefined. In addition to producing a
/// chain, an unindexed load produces one value (result of the		/// chain, an unindexed load produces one value (result of the
/// load); an unindexed store does not produce a value.		/// load); an unindexed store does not produce a value.
///		///
/// PRE_INC Similar to the unindexed mode where the effective address is		/// PRE_INC Similar to the unindexed mode where the effective address is
/// PRE_DEC the value of the base pointer add / subtract the offset.		/// PRE_DEC the value of the base pointer add / subtract the offset.
/// It considers the computation as being folded into the load /		/// It considers the computation as being folded into the load /
/// store operation (i.e. the load / store does the address		/// store operation (i.e. the load / store does the address
/// computation as well as performing the memory transaction).		/// computation as well as performing the memory transaction).
/// The base operand is always undefined. In addition to		/// The base operand is always undefined. In addition to
/// producing a chain, pre-indexed load produces two values		/// producing a chain, pre-indexed load produces two values
/// (result of the load and the result of the address		/// (result of the load and the result of the address
/// computation); a pre-indexed store produces one value (result		/// computation); a pre-indexed store produces one value (result
/// of the address computation).		/// of the address computation).
///		///
/// POST_INC The effective address is the value of the base pointer. The		/// POST_INC The effective address is the value of the base pointer. The
/// POST_DEC value of the offset operand is then added to / subtracted		/// POST_DEC value of the offset operand is then added to / subtracted
/// from the base after memory transaction. In addition to		/// from the base after memory transaction. In addition to
/// producing a chain, post-indexed load produces two values		/// producing a chain, post-indexed load produces two values
/// (the result of the load and the result of the base +/- offset		/// (the result of the load and the result of the base +/- offset
/// computation); a post-indexed store produces one value (the		/// computation); a post-indexed store produces one value (the
/// result of the base +/- offset computation).		/// result of the base +/- offset computation).
enum MemIndexedMode {		enum MemIndexedMode {
UNINDEXED = 0,		UNINDEXED = 0,
PRE_INC,		PRE_INC,
PRE_DEC,		PRE_DEC,
POST_INC,		POST_INC,
POST_DEC		POST_DEC
};		};

static const int LAST_INDEXED_MODE = POST_DEC + 1;		static const int LAST_INDEXED_MODE = POST_DEC + 1;
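
The address arithmetic described for the indexed modes can be modeled in a few lines. This is an illustrative sketch under stated assumptions: the `sketch` namespace, the `IndexedAccess` struct, and `resolveIndexedAccess` are hypothetical names, and addresses are modeled as plain 64-bit integers rather than DAG values.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Mirrors the enum above; redeclared so the example is self-contained.
enum MemIndexedMode { UNINDEXED = 0, PRE_INC, PRE_DEC, POST_INC, POST_DEC };

// Hypothetical result type: the address used for the memory transaction and,
// for indexed modes, the extra value produced by the address computation.
struct IndexedAccess {
  uint64_t Address;   // effective address of the load / store
  bool HasWriteBack;  // false for UNINDEXED
  uint64_t WriteBack; // updated base value (meaningful if HasWriteBack)
};

inline IndexedAccess resolveIndexedAccess(uint64_t Base, uint64_t Offset,
                                          MemIndexedMode Mode) {
  switch (Mode) {
  case PRE_INC:  // address is computed first, then used for the access
    return {Base + Offset, true, Base + Offset};
  case PRE_DEC:
    return {Base - Offset, true, Base - Offset};
  case POST_INC: // access uses the base; the base is updated afterwards
    return {Base, true, Base + Offset};
  case POST_DEC:
    return {Base, true, Base - Offset};
  case UNINDEXED:
    break;
  }
  return {Base, false, 0}; // address already available in the base pointer
}
} // namespace sketch
```

For example, `resolveIndexedAccess(100, 8, sketch::PRE_INC)` accesses address 108 and writes back 108, while `sketch::POST_INC` accesses address 100 and writes back 108.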

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
/// MemIndexType enum - This enum defines how to interpret MGATHER/SCATTER's		/// MemIndexType enum - This enum defines how to interpret MGATHER/SCATTER's
/// index parameter when calculating addresses.		/// index parameter when calculating addresses.
///		///
/// SIGNED_SCALED Addr = Base + ((signed)Index * sizeof(element))		/// SIGNED_SCALED Addr = Base + ((signed)Index * sizeof(element))
/// SIGNED_UNSCALED Addr = Base + (signed)Index		/// SIGNED_UNSCALED Addr = Base + (signed)Index
/// UNSIGNED_SCALED Addr = Base + ((unsigned)Index * sizeof(element))		/// UNSIGNED_SCALED Addr = Base + ((unsigned)Index * sizeof(element))
/// UNSIGNED_UNSCALED Addr = Base + (unsigned)Index		/// UNSIGNED_UNSCALED Addr = Base + (unsigned)Index
enum MemIndexType {		enum MemIndexType {
SIGNED_SCALED = 0,		SIGNED_SCALED = 0,
SIGNED_UNSCALED,		SIGNED_UNSCALED,
UNSIGNED_SCALED,		UNSIGNED_SCALED,
UNSIGNED_UNSCALED		UNSIGNED_UNSCALED
Show All 25 Lines
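
The four `MemIndexType` formulas documented above differ only in how the index is extended and whether it is scaled. A rough sketch follows, assuming a 32-bit index and 64-bit addresses purely for illustration; the `sketch` namespace and `computeAddress` are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Mirrors the enum above; redeclared so the example is self-contained.
enum MemIndexType {
  SIGNED_SCALED = 0,
  SIGNED_UNSCALED,
  UNSIGNED_SCALED,
  UNSIGNED_UNSCALED
};

// Hypothetical helper: computes Addr for one lane of a gather / scatter.
// IndexBits holds the raw 32-bit index (the width is an assumption here).
inline uint64_t computeAddress(uint64_t Base, uint32_t IndexBits,
                               uint64_t ElementSize, MemIndexType Type) {
  const bool IsSigned = Type == SIGNED_SCALED || Type == SIGNED_UNSCALED;
  const bool IsScaled = Type == SIGNED_SCALED || Type == UNSIGNED_SCALED;

  // Sign- or zero-extend the index to the address width. The cast to int32_t
  // assumes a two's-complement target (guaranteed since C++20).
  const uint64_t Index =
      IsSigned ? static_cast<uint64_t>(
                     static_cast<int64_t>(static_cast<int32_t>(IndexBits)))
               : static_cast<uint64_t>(IndexBits);

  // Unsigned wrap-around makes Base + (negative index) behave as expected.
  return Base + Index * (IsScaled ? ElementSize : 1);
}
} // namespace sketch
```

With `Base = 1000`, an index whose bits are `0xFFFFFFFF`, and 4-byte elements, the signed variants treat the index as -1 (addresses 996 scaled, 999 unscaled), whereas the unsigned variants treat it as 4294967295.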
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
/// ISD::CondCode enum - These are ordered carefully to make the bitfields		/// ISD::CondCode enum - These are ordered carefully to make the bitfields
/// below work out, when considering SETFALSE (something that never exists		/// below work out, when considering SETFALSE (something that never exists
/// dynamically) as 0. "U" -> Unsigned (for integer operands) or Unordered		/// dynamically) as 0. "U" -> Unsigned (for integer operands) or Unordered
/// (for floating point), "L" -> Less than, "G" -> Greater than, "E" -> Equal		/// (for floating point), "L" -> Less than, "G" -> Greater than, "E" -> Equal
/// to. If the "N" column is 1, the result of the comparison is undefined if		/// to. If the "N" column is 1, the result of the comparison is undefined if
/// the input is a NAN.		/// the input is a NAN.
///		///
/// All of these (except for the 'always folded ops') should be handled for		/// All of these (except for the 'always folded ops') should be handled for
/// floating point. For integer, only the SETEQ,SETNE,SETLT,SETLE,SETGT,		/// floating point. For integer, only the SETEQ,SETNE,SETLT,SETLE,SETGT,
/// SETGE,SETULT,SETULE,SETUGT, and SETUGE opcodes are used.		/// SETGE,SETULT,SETULE,SETUGT, and SETUGE opcodes are used.
///		///
/// Note that these are laid out in a specific order to allow bit-twiddling		/// Note that these are laid out in a specific order to allow bit-twiddling
/// to transform conditions.		/// to transform conditions.
enum CondCode {		enum CondCode {
// Opcode N U L G E Intuitive operation		// Opcode N U L G E Intuitive operation
SETFALSE, // 0 0 0 0 Always false (always folded)		SETFALSE, // 0 0 0 0 Always false (always folded)
SETOEQ, // 0 0 0 1 True if ordered and equal		SETOEQ, // 0 0 0 1 True if ordered and equal
SETOGT, // 0 0 1 0 True if ordered and greater than		SETOGT, // 0 0 1 0 True if ordered and greater than
SETOGE, // 0 0 1 1 True if ordered and greater than or equal		SETOGE, // 0 0 1 1 True if ordered and greater than or equal
SETOLT, // 0 1 0 0 True if ordered and less than		SETOLT, // 0 1 0 0 True if ordered and less than
▲ Show 20 Lines • Show All 70 Lines • ▼ Show 20 Lines
/// SETCC_INVALID if it is not possible to represent the resultant comparison.		/// SETCC_INVALID if it is not possible to represent the resultant comparison.
CondCode getSetCCOrOperation(CondCode Op1, CondCode Op2, EVT Type);		CondCode getSetCCOrOperation(CondCode Op1, CondCode Op2, EVT Type);

/// Return the result of a logical AND between different comparisons of		/// Return the result of a logical AND between different comparisons of
/// identical values: ((X op1 Y) & (X op2 Y)). This function returns		/// identical values: ((X op1 Y) & (X op2 Y)). This function returns
/// SETCC_INVALID if it is not possible to represent the resultant comparison.		/// SETCC_INVALID if it is not possible to represent the resultant comparison.
CondCode getSetCCAndOperation(CondCode Op1, CondCode Op2, EVT Type);		CondCode getSetCCAndOperation(CondCode Op1, CondCode Op2, EVT Type);
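The "bit-twiddling" mentioned in the `CondCode` layout comment can be illustrated with a small stand-alone sketch. It mirrors only the sixteen floating-point codes, whose low four bits are U, L, G and E as in the table. The names `FPCond`, `orCond`, `andCond` and `invert` are hypothetical and are not the LLVM helpers; the real `getSetCCOrOperation` and `getSetCCAndOperation` also deal with integer comparisons and the don't-care codes, and may return `SETCC_INVALID`.

```cpp
#include <cstdint>

// Hypothetical mirror of the 16 floating-point condition codes.
// Bit 3 = U (unordered), bit 2 = L (less), bit 1 = G (greater), bit 0 = E (equal).
enum FPCond : std::uint8_t {
  F_FALSE = 0, F_OEQ, F_OGT, F_OGE, F_OLT, F_OLE, F_ONE, F_O,
  F_UO,        F_UEQ, F_UGT, F_UGE, F_ULT, F_ULE, F_UNE, F_TRUE
};

// (X a Y) | (X b Y): the union of the accepted relations is a bitwise OR.
constexpr FPCond orCond(FPCond A, FPCond B) {
  return static_cast<FPCond>(A | B);
}

// (X a Y) & (X b Y): the intersection of the accepted relations is a bitwise AND.
constexpr FPCond andCond(FPCond A, FPCond B) {
  return static_cast<FPCond>(A & B);
}

// Logical negation flips all four relation bits.
constexpr FPCond invert(FPCond A) {
  return static_cast<FPCond>(~A & 0xF);
}

static_assert(orCond(F_OLT, F_OEQ) == F_OLE, "less OR equal");
static_assert(andCond(F_OLE, F_OGE) == F_OEQ, "<= AND >= is ==");
static_assert(invert(F_OLT) == F_UGE, "not (ordered <) is unordered or >=");
```

The encoding treats each code as a set of the four mutually exclusive outcomes of a comparison, which is why set union, intersection and complement map onto single bitwise operations.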

		/// Return the mask operand of this VP SDNode.
		/// Otherwise, return -1.
		SjoerdMeijer: just spell out 'otherwise' here, and also below.
		int GetMaskPosVP(unsigned OpCode);

		/// Return the vector length operand of this VP SDNode.
		/// Otherwise, return -1.
		int GetVectorLengthPosVP(unsigned OpCode);

		/// Translate this VP OpCode to an unpredicated instruction OpCode.
		unsigned GetFunctionOpCodeForVP(unsigned VPOpCode, bool hasFPExcept);

		/// Translate this non-VP Opcode to its corresponding VP Opcode
		unsigned GetVPForFunctionOpCode(unsigned OpCode);

} // end llvm::ISD namespace		} // end llvm::ISD namespace

} // end llvm namespace		} // end llvm namespace

#endif		#endif
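The contract of `GetMaskPosVP` and `GetVectorLengthPosVP` (an operand index for VP opcodes, -1 for everything else) might look roughly like the sketch below. The opcode enum, the function names and the operand layouts are assumptions made for illustration: the binary-op layout `(LHS, RHS, Mask, EVL)` is assumed, while the set-cc layout `(LHS, RHS, CondCode, Mask, EVL)` follows the operand order passed to `getNode` in `getVPSetCC`. The sketch also assumes the vector length operand always directly follows the mask.

```cpp
// Hypothetical opcode set; the real values live in ISD::NodeType.
enum ToyOpcode : unsigned { TOY_ADD, TOY_VP_ADD, TOY_VP_SETCC };

// Operand index of the mask if Opc is a VP opcode; -1 if it is not.
int toyMaskPos(unsigned Opc) {
  switch (Opc) {
  case TOY_VP_ADD:
    return 2; // assumed layout: (LHS, RHS, Mask, EVL)
  case TOY_VP_SETCC:
    return 3; // layout: (LHS, RHS, CondCode, Mask, EVL)
  default:
    return -1; // not a VP opcode
  }
}

// Operand index of the explicit vector length if Opc is a VP opcode; -1 if it
// is not. Assumes the EVL operand directly follows the mask operand.
int toyVectorLengthPos(unsigned Opc) {
  const int MaskPos = toyMaskPos(Opc);
  return MaskPos < 0 ? -1 : MaskPos + 1;
}
```

A caller would check for -1 before indexing into the node's operand list.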

llvm/include/llvm/CodeGen/Passes.h

Show First 20 Lines • Show All 438 Lines • ▼ Show 20 Lines	/// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.
/// This pass performs outlining on machine instructions directly before		/// This pass performs outlining on machine instructions directly before
/// printing assembly.		/// printing assembly.
ModulePass *createMachineOutlinerPass(bool RunOnAllFunctions = true);		ModulePass *createMachineOutlinerPass(bool RunOnAllFunctions = true);

/// This pass expands the experimental reduction intrinsics into sequences of		/// This pass expands the experimental reduction intrinsics into sequences of
/// shuffles.		/// shuffles.
FunctionPass *createExpandReductionsPass();		FunctionPass *createExpandReductionsPass();

		/// This pass expands the vector predication intrinsics into unpredicated
		/// instructions with selects or just the explicit vector length into the
		/// predicate mask.
		FunctionPass *createExpandVectorPredicationPass();
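As a scalar model of what this expansion could mean, the sketch below folds an explicit vector length into the lane mask and then uses a select to keep an unpredicated division safe on disabled lanes. This is an illustration under assumptions, not the pass itself: `foldEVLIntoMask` and `expandedUDiv` are hypothetical names, and the choice of a divisor of 1 on disabled lanes is one plausible use of selects rather than something the patch is confirmed to do.

```cpp
#include <cstddef>
#include <vector>

// Fold the explicit vector length into the mask: lane I stays active only if
// its mask bit is set and I < EVL.
std::vector<bool> foldEVLIntoMask(const std::vector<bool> &Mask,
                                  std::size_t EVL) {
  std::vector<bool> Active(Mask.size());
  for (std::size_t I = 0; I < Mask.size(); ++I)
    Active[I] = Mask[I] && I < EVL;
  return Active;
}

// Unpredicated division over all lanes. A select substitutes the divisor 1 on
// disabled lanes so they cannot divide by zero. A, B and Mask must have the
// same length, and B must be nonzero on every active lane.
std::vector<unsigned> expandedUDiv(const std::vector<unsigned> &A,
                                   const std::vector<unsigned> &B,
                                   const std::vector<bool> &Mask,
                                   std::size_t EVL) {
  const std::vector<bool> Active = foldEVLIntoMask(Mask, EVL);
  std::vector<unsigned> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    const unsigned SafeDivisor = Active[I] ? B[I] : 1u; // the "select"
    Out[I] = A[I] / SafeDivisor;
  }
  return Out;
}
```

The values produced on disabled lanes are an artifact of this model; consumers of a predicated result should not rely on them.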

// This pass expands memcmp() to load/stores.		// This pass expands memcmp() to load/stores.
FunctionPass *createExpandMemCmpPass();		FunctionPass *createExpandMemCmpPass();

/// Creates Break False Dependencies pass. \see BreakFalseDeps.cpp		/// Creates Break False Dependencies pass. \see BreakFalseDeps.cpp
FunctionPass *createBreakFalseDeps();		FunctionPass *createBreakFalseDeps();

// This pass expands indirectbr instructions.		// This pass expands indirectbr instructions.
FunctionPass *createIndirectBrExpandPass();		FunctionPass *createIndirectBrExpandPass();
Show All 17 Lines

llvm/include/llvm/CodeGen/SelectionDAG.h

Show First 20 Lines • Show All 1,043 Lines • ▼ Show 20 Lines	SDValue getSetCC(const SDLoc &DL, EVT VT, SDValue LHS, SDValue RHS,
assert(Cond != ISD::SETCC_INVALID &&		assert(Cond != ISD::SETCC_INVALID &&
"Cannot create a setCC of an invalid node.");		"Cannot create a setCC of an invalid node.");
if (Chain)		if (Chain)
return getNode(IsSignaling ? ISD::STRICT_FSETCCS : ISD::STRICT_FSETCC, DL,		return getNode(IsSignaling ? ISD::STRICT_FSETCCS : ISD::STRICT_FSETCC, DL,
{VT, MVT::Other}, {Chain, LHS, RHS, getCondCode(Cond)});		{VT, MVT::Other}, {Chain, LHS, RHS, getCondCode(Cond)});
return getNode(ISD::SETCC, DL, VT, LHS, RHS, getCondCode(Cond));		return getNode(ISD::SETCC, DL, VT, LHS, RHS, getCondCode(Cond));
}		}

		/// Helper function to make it easier to build VP_SetCC's if you just have an
		/// ISD::CondCode instead of an SDValue.
		SDValue getVPSetCC(const SDLoc &DL, EVT VT, SDValue LHS, SDValue RHS,
		ISD::CondCode Cond, SDValue Mask, SDValue EVL) {
		assert(LHS.getValueType().isVector() == RHS.getValueType().isVector() &&
		"Cannot compare scalars to vectors");
		assert(LHS.getValueType().isVector() == VT.isVector() &&
		"Cannot compare scalars to vectors");
		assert(Cond != ISD::SETCC_INVALID &&
		"Cannot create a setCC of an invalid node.");
		return getNode(ISD::VP_SETCC, DL, VT, LHS, RHS, getCondCode(Cond), Mask,
		EVL);
		}
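For readers unfamiliar with the two extra operands, here is a lane-by-lane model of a predicated "less than" comparison. It is only a sketch: `LaneResult` and `toyVPSetLT` are hypothetical, and the explicit `Disabled` state stands in for lanes whose result a predicated node leaves unspecified.

```cpp
#include <cstddef>
#include <vector>

enum class LaneResult { Disabled, False, True };

// Compare L < R per lane. A lane is evaluated only if its mask bit is set and
// its index is below the explicit vector length EVL.
std::vector<LaneResult> toyVPSetLT(const std::vector<int> &L,
                                   const std::vector<int> &R,
                                   const std::vector<bool> &Mask,
                                   std::size_t EVL) {
  std::vector<LaneResult> Out(L.size(), LaneResult::Disabled);
  for (std::size_t I = 0; I < L.size(); ++I) {
    if (!Mask[I] || I >= EVL)
      continue; // lane is switched off by the mask or by the vector length
    Out[I] = L[I] < R[I] ? LaneResult::True : LaneResult::False;
  }
  return Out;
}
```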

/// Helper function to make it easier to build Select's if you just have		/// Helper function to make it easier to build Select's if you just have
/// operands and don't want to check for vector.		/// operands and don't want to check for vector.
SDValue getSelect(const SDLoc &DL, EVT VT, SDValue Cond, SDValue LHS,		SDValue getSelect(const SDLoc &DL, EVT VT, SDValue Cond, SDValue LHS,
SDValue RHS) {		SDValue RHS) {
assert(LHS.getValueType() == RHS.getValueType() &&		assert(LHS.getValueType() == RHS.getValueType() &&
"Cannot use select on differing types");		"Cannot use select on differing types");
assert(VT.isVector() == LHS.getValueType().isVector() &&		assert(VT.isVector() == LHS.getValueType().isVector() &&
"Cannot mix vectors and scalars");		"Cannot mix vectors and scalars");
▲ Show 20 Lines • Show All 124 Lines • ▼ Show 20 Lines	getTruncStore(SDValue Chain, const SDLoc &dl, SDValue Val, SDValue Ptr,
MachinePointerInfo PtrInfo, EVT SVT, unsigned Alignment = 0,		MachinePointerInfo PtrInfo, EVT SVT, unsigned Alignment = 0,
MachineMemOperand::Flags MMOFlags = MachineMemOperand::MONone,		MachineMemOperand::Flags MMOFlags = MachineMemOperand::MONone,
const AAMDNodes &AAInfo = AAMDNodes());		const AAMDNodes &AAInfo = AAMDNodes());
SDValue getTruncStore(SDValue Chain, const SDLoc &dl, SDValue Val,		SDValue getTruncStore(SDValue Chain, const SDLoc &dl, SDValue Val,
SDValue Ptr, EVT SVT, MachineMemOperand *MMO);		SDValue Ptr, EVT SVT, MachineMemOperand *MMO);
SDValue getIndexedStore(SDValue OrigStore, const SDLoc &dl, SDValue Base,		SDValue getIndexedStore(SDValue OrigStore, const SDLoc &dl, SDValue Base,
SDValue Offset, ISD::MemIndexedMode AM);		SDValue Offset, ISD::MemIndexedMode AM);

		/// Returns sum of the base pointer and offset.
		SDValue getLoadVP(EVT VT, const SDLoc &dl, SDValue Chain, SDValue Ptr,
		SDValue Mask, SDValue VLen, EVT MemVT,
		MachineMemOperand *MMO, ISD::LoadExtType);
		SDValue getStoreVP(SDValue Chain, const SDLoc &dl, SDValue Val,
		SDValue Ptr, SDValue Mask, SDValue VLen, EVT MemVT,
		MachineMemOperand *MMO, bool IsTruncating = false);
		SDValue getGatherVP(SDVTList VTs, EVT VT, const SDLoc &dl,
		ArrayRef<SDValue> Ops, MachineMemOperand *MMO,
		ISD::MemIndexType IndexType);
		SDValue getScatterVP(SDVTList VTs, EVT VT, const SDLoc &dl,
		ArrayRef<SDValue> Ops, MachineMemOperand *MMO,
		ISD::MemIndexType IndexType);
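The point of a predicated load is that disabled lanes perform no memory access at all. A minimal scalar model, assuming a contiguous load of `int` elements, is sketched below. `ToyLoadResult` and `toyLoadVP` are hypothetical names; the access counter exists only to make the "no access" property observable, and the 0 left in disabled lanes is a placeholder for an unspecified value.

```cpp
#include <cstddef>
#include <vector>

struct ToyLoadResult {
  std::vector<int> Lanes; // loaded values; 0 is a placeholder on disabled lanes
  std::size_t Accesses;   // number of memory reads actually performed
};

// Read Ptr[I] only for lanes whose mask bit is set and whose index is below
// the explicit vector length; all other lanes never touch memory.
ToyLoadResult toyLoadVP(const int *Ptr, const std::vector<bool> &Mask,
                        std::size_t EVL) {
  ToyLoadResult R{std::vector<int>(Mask.size(), 0), 0};
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    if (Mask[I] && I < EVL) {
      R.Lanes[I] = Ptr[I];
      ++R.Accesses;
    }
  }
  return R;
}
```

With a four-lane mask of `{1, 0, 1, 1}` and a vector length of 3, only lanes 0 and 2 are read.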

SDValue getMaskedLoad(EVT VT, const SDLoc &dl, SDValue Chain, SDValue Base,		SDValue getMaskedLoad(EVT VT, const SDLoc &dl, SDValue Chain, SDValue Base,
SDValue Offset, SDValue Mask, SDValue Src0, EVT MemVT,		SDValue Offset, SDValue Mask, SDValue Src0, EVT MemVT,
MachineMemOperand *MMO, ISD::MemIndexedMode AM,		MachineMemOperand *MMO, ISD::MemIndexedMode AM,
ISD::LoadExtType, bool IsExpanding = false);		ISD::LoadExtType, bool IsExpanding = false);
SDValue getIndexedMaskedLoad(SDValue OrigLoad, const SDLoc &dl, SDValue Base,		SDValue getIndexedMaskedLoad(SDValue OrigLoad, const SDLoc &dl, SDValue Base,
SDValue Offset, ISD::MemIndexedMode AM);		SDValue Offset, ISD::MemIndexedMode AM);
SDValue getMaskedStore(SDValue Chain, const SDLoc &dl, SDValue Val,		SDValue getMaskedStore(SDValue Chain, const SDLoc &dl, SDValue Val,
SDValue Base, SDValue Offset, SDValue Mask, EVT MemVT,		SDValue Base, SDValue Offset, SDValue Mask, EVT MemVT,
▲ Show 20 Lines • Show All 655 Lines • Show Last 20 Lines

llvm/include/llvm/CodeGen/SelectionDAGNodes.h

Show First 20 Lines • Show All 544 Lines • ▼ Show 20 Lines	class MemSDNodeBitfields {
uint16_t IsInvariant : 1;		uint16_t IsInvariant : 1;
};		};
enum { NumMemSDNodeBits = NumSDNodeBits + 4 };		enum { NumMemSDNodeBits = NumSDNodeBits + 4 };

class LSBaseSDNodeBitfields {		class LSBaseSDNodeBitfields {
friend class LSBaseSDNode;		friend class LSBaseSDNode;
friend class MaskedLoadStoreSDNode;		friend class MaskedLoadStoreSDNode;
friend class MaskedGatherScatterSDNode;		friend class MaskedGatherScatterSDNode;
		friend class VPGatherScatterSDNode;

uint16_t : NumMemSDNodeBits;		uint16_t : NumMemSDNodeBits;

// This storage is shared between disparate class hierarchies to hold an		// This storage is shared between disparate class hierarchies to hold an
// enumeration specific to the class hierarchy in use.		// enumeration specific to the class hierarchy in use.
// LSBaseSDNode => enum ISD::MemIndexedMode		// LSBaseSDNode => enum ISD::MemIndexedMode
// MaskedLoadStoreBaseSDNode => enum ISD::MemIndexedMode		// MaskedLoadStoreBaseSDNode => enum ISD::MemIndexedMode
// MaskedGatherScatterSDNode => enum ISD::MemIndexType		// MaskedGatherScatterSDNode => enum ISD::MemIndexType
uint16_t AddressingMode : 3;		uint16_t AddressingMode : 3;
};		};
enum { NumLSBaseSDNodeBits = NumMemSDNodeBits + 3 };		enum { NumLSBaseSDNodeBits = NumMemSDNodeBits + 3 };

class LoadSDNodeBitfields {		class LoadSDNodeBitfields {
friend class LoadSDNode;		friend class LoadSDNode;
friend class MaskedLoadSDNode;		friend class MaskedLoadSDNode;
		friend class VPLoadSDNode;

uint16_t : NumLSBaseSDNodeBits;		uint16_t : NumLSBaseSDNodeBits;

uint16_t ExtTy : 2; // enum ISD::LoadExtType		uint16_t ExtTy : 2; // enum ISD::LoadExtType
uint16_t IsExpanding : 1;		uint16_t IsExpanding : 1;
};		};

class StoreSDNodeBitfields {		class StoreSDNodeBitfields {
friend class StoreSDNode;		friend class StoreSDNode;
friend class MaskedStoreSDNode;		friend class MaskedStoreSDNode;
		friend class VPStoreSDNode;

uint16_t : NumLSBaseSDNodeBits;		uint16_t : NumLSBaseSDNodeBits;

uint16_t IsTruncating : 1;		uint16_t IsTruncating : 1;
uint16_t IsCompressing : 1;		uint16_t IsCompressing : 1;
};		};
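The bitfield classes above share one 16-bit word: each class skips the bits owned by its base with an unnamed bitfield and then declares its own fields. A stand-alone sketch of that technique follows. The type names and the width `NumBaseBits = 4` are made up for the example, and the size check assumes a mainstream ABI where adjacent `uint16_t` bitfields pack into a single 16-bit unit.

```cpp
#include <cstdint>

enum { NumBaseBits = 4 }; // hypothetical number of bits owned by the base

struct BaseBits {
  std::uint16_t Kind : NumBaseBits;
};

struct LoadBits {
  std::uint16_t : NumBaseBits; // unnamed bitfield: skip the base's bits
  std::uint16_t ExtTy : 2;
  std::uint16_t IsExpanding : 1;
};

struct StoreBits {
  std::uint16_t : NumBaseBits; // same padding, different fields after it
  std::uint16_t IsTruncating : 1;
  std::uint16_t IsCompressing : 1;
};

// All views overlay the same storage; a node uses only the view that matches
// its own class.
union ToyBits {
  std::uint16_t Raw;
  BaseBits Base;
  LoadBits Load;
  StoreBits Store;
};

static_assert(sizeof(ToyBits) == sizeof(std::uint16_t),
              "all views are expected to fit in one 16-bit word");
```

Adding a `friend` declaration, as the patch does for the VP node classes, gives a new node class access to an existing view without consuming additional bits.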

union {		union {
▲ Show 20 Lines • Show All 114 Lines • ▼ Show 20 Lines	switch (NodeType) {
case ISD::STRICT_FP_TO_FP16:		case ISD::STRICT_FP_TO_FP16:
#define DAG_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN) \		#define DAG_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN) \
case ISD::STRICT_##DAGN:		case ISD::STRICT_##DAGN:
#include "llvm/IR/ConstrainedOps.def"		#include "llvm/IR/ConstrainedOps.def"
return true;		return true;
}		}
}		}

		/// Test whether this is a vector predicated node.
		SjoerdMeijer: Perhaps outdated comment? Should it be something along the lines of 'vector predicated node' instead of 'explicit vector length node'?
		bool isVP() const {
		switch (NodeType) {
		default:
		return false;
		case ISD::VP_LOAD:
		case ISD::VP_STORE:
		case ISD::VP_GATHER:
		case ISD::VP_SCATTER:

		case ISD::VP_FNEG:

		case ISD::VP_FADD:
		case ISD::VP_FMUL:
		case ISD::VP_FSUB:
		case ISD::VP_FDIV:
		case ISD::VP_FREM:

		case ISD::VP_FMA:

		case ISD::VP_ADD:
		case ISD::VP_MUL:
		case ISD::VP_SUB:
		case ISD::VP_SRA:
		case ISD::VP_SRL:
		case ISD::VP_SHL:
		case ISD::VP_UDIV:
		case ISD::VP_SDIV:
		case ISD::VP_UREM:
		case ISD::VP_SREM:

		case ISD::VP_EXPAND:
		case ISD::VP_COMPRESS:
		case ISD::VP_VSHIFT:
		case ISD::VP_SETCC:
		case ISD::VP_COMPOSE:

		case ISD::VP_AND:
		case ISD::VP_XOR:
		case ISD::VP_OR:

		case ISD::VP_REDUCE_ADD:
		case ISD::VP_REDUCE_SMIN:
		case ISD::VP_REDUCE_SMAX:
		case ISD::VP_REDUCE_UMIN:
		case ISD::VP_REDUCE_UMAX:

		case ISD::VP_REDUCE_MUL:
		case ISD::VP_REDUCE_AND:
		case ISD::VP_REDUCE_OR:
		case ISD::VP_REDUCE_FADD:
		case ISD::VP_REDUCE_FMUL:
		case ISD::VP_REDUCE_FMIN:
		case ISD::VP_REDUCE_FMAX:

		return true;
		}
		}


/// Test if this node has a post-isel opcode, directly		/// Test if this node has a post-isel opcode, directly
/// corresponding to a MachineInstr opcode.		/// corresponding to a MachineInstr opcode.
bool isMachineOpcode() const { return NodeType < 0; }		bool isMachineOpcode() const { return NodeType < 0; }

/// This may only be called if isMachineOpcode returns		/// This may only be called if isMachineOpcode returns
/// true. It returns the MachineInstr opcode value that the node's opcode		/// true. It returns the MachineInstr opcode value that the node's opcode
/// corresponds to.		/// corresponds to.
unsigned getMachineOpcode() const {		unsigned getMachineOpcode() const {
▲ Show 20 Lines • Show All 667 Lines • ▼ Show 20 Lines	public:
const SDValue &getBasePtr() const {		const SDValue &getBasePtr() const {
return getOperand(getOpcode() == ISD::STORE ? 2 : 1);		return getOperand(getOpcode() == ISD::STORE ? 2 : 1);
}		}

// Methods to support isa and dyn_cast		// Methods to support isa and dyn_cast
static bool classof(const SDNode *N) {		static bool classof(const SDNode *N) {
// For some targets, we lower some target intrinsics to a MemIntrinsicNode		// For some targets, we lower some target intrinsics to a MemIntrinsicNode
// with either an intrinsic or a target opcode.		// with either an intrinsic or a target opcode.
return N->getOpcode() == ISD::LOAD ||		return N->getOpcode() == ISD::LOAD ||
N->getOpcode() == ISD::STORE ||		N->getOpcode() == ISD::STORE ||
N->getOpcode() == ISD::PREFETCH ||		N->getOpcode() == ISD::PREFETCH ||
N->getOpcode() == ISD::ATOMIC_CMP_SWAP ||		N->getOpcode() == ISD::ATOMIC_CMP_SWAP ||
N->getOpcode() == ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS ||		N->getOpcode() == ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS ||
N->getOpcode() == ISD::ATOMIC_SWAP ||		N->getOpcode() == ISD::ATOMIC_SWAP ||
N->getOpcode() == ISD::ATOMIC_LOAD_ADD ||		N->getOpcode() == ISD::ATOMIC_LOAD_ADD ||
N->getOpcode() == ISD::ATOMIC_LOAD_SUB ||		N->getOpcode() == ISD::ATOMIC_LOAD_SUB ||
N->getOpcode() == ISD::ATOMIC_LOAD_AND ||		N->getOpcode() == ISD::ATOMIC_LOAD_AND ||
N->getOpcode() == ISD::ATOMIC_LOAD_CLR ||		N->getOpcode() == ISD::ATOMIC_LOAD_CLR ||
N->getOpcode() == ISD::ATOMIC_LOAD_OR ||		N->getOpcode() == ISD::ATOMIC_LOAD_OR ||
N->getOpcode() == ISD::ATOMIC_LOAD_XOR ||		N->getOpcode() == ISD::ATOMIC_LOAD_XOR ||
N->getOpcode() == ISD::ATOMIC_LOAD_NAND ||		N->getOpcode() == ISD::ATOMIC_LOAD_NAND ||
N->getOpcode() == ISD::ATOMIC_LOAD_MIN ||		N->getOpcode() == ISD::ATOMIC_LOAD_MIN ||
N->getOpcode() == ISD::ATOMIC_LOAD_MAX ||		N->getOpcode() == ISD::ATOMIC_LOAD_MAX ||
N->getOpcode() == ISD::ATOMIC_LOAD_UMIN ||		N->getOpcode() == ISD::ATOMIC_LOAD_UMIN ||
N->getOpcode() == ISD::ATOMIC_LOAD_UMAX ||		N->getOpcode() == ISD::ATOMIC_LOAD_UMAX ||
N->getOpcode() == ISD::ATOMIC_LOAD_FADD ||		N->getOpcode() == ISD::ATOMIC_LOAD_FADD ||
N->getOpcode() == ISD::ATOMIC_LOAD_FSUB ||		N->getOpcode() == ISD::ATOMIC_LOAD_FSUB ||
N->getOpcode() == ISD::ATOMIC_LOAD ||		N->getOpcode() == ISD::ATOMIC_LOAD ||
N->getOpcode() == ISD::ATOMIC_STORE ||		N->getOpcode() == ISD::ATOMIC_STORE ||
N->getOpcode() == ISD::MLOAD ||		N->getOpcode() == ISD::MLOAD ||
N->getOpcode() == ISD::MSTORE ||		N->getOpcode() == ISD::MSTORE ||
N->getOpcode() == ISD::MGATHER ||		N->getOpcode() == ISD::MGATHER ||
N->getOpcode() == ISD::MSCATTER ||		N->getOpcode() == ISD::MSCATTER ||
		N->getOpcode() == ISD::VP_LOAD ||
		SjoerdMeijer: indentation of `||` off by 1?
		N->getOpcode() == ISD::VP_STORE ||
		N->getOpcode() == ISD::VP_GATHER ||
		N->getOpcode() == ISD::VP_SCATTER ||
N->isMemIntrinsic() ||		N->isMemIntrinsic() ||
N->isTargetMemoryOpcode();		N->isTargetMemoryOpcode();
}		}
};		};

/// This is an SDNode representing atomic operations.		/// This is an SDNode representing atomic operations.
class AtomicSDNode : public MemSDNode {		class AtomicSDNode : public MemSDNode {
public:		public:
▲ Show 20 Lines • Show All 845 Lines • ▼ Show 20 Lines	public:
const SDValue &getBasePtr() const { return getOperand(2); }		const SDValue &getBasePtr() const { return getOperand(2); }
const SDValue &getOffset() const { return getOperand(3); }		const SDValue &getOffset() const { return getOperand(3); }

static bool classof(const SDNode *N) {		static bool classof(const SDNode *N) {
return N->getOpcode() == ISD::STORE;		return N->getOpcode() == ISD::STORE;
}		}
};		};

		/// This base class is used to represent VP_LOAD and VP_STORE nodes
		SjoerdMeijer: `VP_LOAD` and `VP_STORE`?
		class VPLoadStoreSDNode : public MemSDNode {
		public:
		friend class SelectionDAG;

		VPLoadStoreSDNode(ISD::NodeType NodeTy, unsigned Order,
		const DebugLoc &dl, SDVTList VTs, EVT MemVT,
		MachineMemOperand *MMO)
		: MemSDNode(NodeTy, Order, dl, VTs, MemVT, MMO) {}

		// VPLoadSDNode (Chain, ptr, mask, VLen)
		// VPStoreSDNode (Chain, data, ptr, mask, VLen)
		// Mask is a vector of i1 elements, Vlen is i32
		const SDValue &getBasePtr() const {
		return getOperand(getOpcode() == ISD::VP_LOAD ? 1 : 2);
		}
		const SDValue &getMask() const {
		return getOperand(getOpcode() == ISD::VP_LOAD ? 2 : 3);
		}
		const SDValue &getVectorLength() const {
		return getOperand(getOpcode() == ISD::VP_LOAD ? 3 : 4);
		}

		static bool classof(const SDNode *N) {
		return N->getOpcode() == ISD::VP_LOAD ||
		N->getOpcode() == ISD::VP_STORE;
		}
		};

		/// This class is used to represent a VP_LOAD node
		SjoerdMeijer: same?
		class VPLoadSDNode : public VPLoadStoreSDNode {
		public:
		friend class SelectionDAG;

		VPLoadSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
		ISD::LoadExtType ETy, EVT MemVT,
		MachineMemOperand *MMO)
		: VPLoadStoreSDNode(ISD::VP_LOAD, Order, dl, VTs, MemVT, MMO) {
		LoadSDNodeBits.ExtTy = ETy;
		LoadSDNodeBits.IsExpanding = false;
		}

		ISD::LoadExtType getExtensionType() const {
		return static_cast<ISD::LoadExtType>(LoadSDNodeBits.ExtTy);
		}

		const SDValue &getBasePtr() const { return getOperand(1); }
		const SDValue &getMask() const { return getOperand(2); }
		const SDValue &getVectorLength() const { return getOperand(3); }

		static bool classof(const SDNode *N) {
		return N->getOpcode() == ISD::VP_LOAD;
		}
		bool isExpandingLoad() const { return LoadSDNodeBits.IsExpanding; }
		};

		/// This class is used to represent a VP_STORE node
		class VPStoreSDNode : public VPLoadStoreSDNode {
		public:
		friend class SelectionDAG;

		VPStoreSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
		bool isTrunc, EVT MemVT,
		MachineMemOperand *MMO)
		: VPLoadStoreSDNode(ISD::VP_STORE, Order, dl, VTs, MemVT, MMO) {
		StoreSDNodeBits.IsTruncating = isTrunc;
		StoreSDNodeBits.IsCompressing = false;
		}

		/// Return true if this is a truncating store.
		SjoerdMeijer: `.. does a truncation before store` sounds a bit odd. Since 'truncating store' is a well-known term, and you explain what it means for ints/floats below, I think it suffices to say "Return true if this is a truncating store. For integers ..."
		/// For integers this is the same as doing a TRUNCATE and storing the result.
		/// For floats, it is the same as doing an FP_ROUND and storing the result.
		bool isTruncatingStore() const { return StoreSDNodeBits.IsTruncating; }

		/// Returns true if the op does a compression to the vector before storing.
		/// The node contiguously stores the active elements (integers or floats)
		/// in src (those with their respective bit set in writemask k) to unaligned
		/// memory at base_addr.
		bool isCompressingStore() const { return StoreSDNodeBits.IsCompressing; }

		const SDValue &getValue() const { return getOperand(1); }
		const SDValue &getBasePtr() const { return getOperand(2); }
		const SDValue &getMask() const { return getOperand(3); }
		const SDValue &getVectorLength() const { return getOperand(4); }

		static bool classof(const SDNode *N) {
		return N->getOpcode() == ISD::VP_STORE;
		}
		};

/// This base class is used to represent MLOAD and MSTORE nodes		/// This base class is used to represent MLOAD and MSTORE nodes
class MaskedLoadStoreSDNode : public MemSDNode {		class MaskedLoadStoreSDNode : public MemSDNode {
public:		public:
friend class SelectionDAG;		friend class SelectionDAG;

MaskedLoadStoreSDNode(ISD::NodeType NodeTy, unsigned Order,		MaskedLoadStoreSDNode(ISD::NodeType NodeTy, unsigned Order,
const DebugLoc &dl, SDVTList VTs,		const DebugLoc &dl, SDVTList VTs,
ISD::MemIndexedMode AM, EVT MemVT,		ISD::MemIndexedMode AM, EVT MemVT,
▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	public:
const SDValue &getMask() const { return getOperand(4); }		const SDValue &getMask() const { return getOperand(4); }

static bool classof(const SDNode *N) {		static bool classof(const SDNode *N) {
return N->getOpcode() == ISD::MSTORE;		return N->getOpcode() == ISD::MSTORE;
}		}
};		};

/// This is a base class used to represent		/// This is a base class used to represent
		/// VP_GATHER and VP_SCATTER nodes
		///
		class VPGatherScatterSDNode : public MemSDNode {
		public:
		friend class SelectionDAG;

		VPGatherScatterSDNode(ISD::NodeType NodeTy, unsigned Order,
		const DebugLoc &dl, SDVTList VTs, EVT MemVT,
		MachineMemOperand *MMO, ISD::MemIndexType IndexType)
		: MemSDNode(NodeTy, Order, dl, VTs, MemVT, MMO) {
		LSBaseSDNodeBits.AddressingMode = IndexType;
		assert(getIndexType() == IndexType && "Value truncated");
		}

		/// How is Index applied to BasePtr when computing addresses.
		ISD::MemIndexType getIndexType() const {
		return static_cast<ISD::MemIndexType>(LSBaseSDNodeBits.AddressingMode);
		}
		bool isIndexScaled() const {
		return (getIndexType() == ISD::SIGNED_SCALED) ||
		(getIndexType() == ISD::UNSIGNED_SCALED);
		}
		bool isIndexSigned() const {
		return (getIndexType() == ISD::SIGNED_SCALED) ||
		(getIndexType() == ISD::SIGNED_UNSCALED);
		}

		// Operand layout of both nodes (scatter has an extra value operand at Op1):
		// VPGatherSDNode (Chain, base, index, scale, mask, vlen)
		// VPScatterSDNode (Chain, value, base, index, scale, mask, vlen)
		// Mask is a vector of i1 elements
		const SDValue &getBasePtr() const { return getOperand((getOpcode() == ISD::VP_GATHER) ? 1 : 2); }
		const SDValue &getIndex() const { return getOperand((getOpcode() == ISD::VP_GATHER) ? 2 : 3); }
		const SDValue &getScale() const { return getOperand((getOpcode() == ISD::VP_GATHER) ? 3 : 4); }
		const SDValue &getMask() const { return getOperand((getOpcode() == ISD::VP_GATHER) ? 4 : 5); }
		const SDValue &getVectorLength() const { return getOperand((getOpcode() == ISD::VP_GATHER) ? 5 : 6); }

		static bool classof(const SDNode *N) {
		return N->getOpcode() == ISD::VP_GATHER ||
		N->getOpcode() == ISD::VP_SCATTER;
		}
		};

		/// This class is used to represent a VP_GATHER node
		///
		class VPGatherSDNode : public VPGatherScatterSDNode {
		public:
		friend class SelectionDAG;

		VPGatherSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
		EVT MemVT, MachineMemOperand *MMO,
		ISD::MemIndexType IndexType)
		: VPGatherScatterSDNode(ISD::VP_GATHER, Order, dl, VTs, MemVT, MMO, IndexType) {}

		static bool classof(const SDNode *N) {
		return N->getOpcode() == ISD::VP_GATHER;
		}
		};

		/// This class is used to represent a VP_SCATTER node
		///
		class VPScatterSDNode : public VPGatherScatterSDNode {
		public:
		friend class SelectionDAG;

		VPScatterSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
		EVT MemVT, MachineMemOperand *MMO,
		ISD::MemIndexType IndexType)
		: VPGatherScatterSDNode(ISD::VP_SCATTER, Order, dl, VTs, MemVT, MMO, IndexType) {}

		const SDValue &getValue() const { return getOperand(1); }

		static bool classof(const SDNode *N) {
		return N->getOpcode() == ISD::VP_SCATTER;
		}
		};


		/// This is a base class used to represent
/// MGATHER and MSCATTER nodes		/// MGATHER and MSCATTER nodes
///		///
class MaskedGatherScatterSDNode : public MemSDNode {		class MaskedGatherScatterSDNode : public MemSDNode {
public:		public:
friend class SelectionDAG;		friend class SelectionDAG;

MaskedGatherScatterSDNode(ISD::NodeType NodeTy, unsigned Order,		MaskedGatherScatterSDNode(ISD::NodeType NodeTy, unsigned Order,
const DebugLoc &dl, SDVTList VTs, EVT MemVT,		const DebugLoc &dl, SDVTList VTs, EVT MemVT,
▲ Show 20 Lines • Show All 295 Lines • Show Last 20 Lines
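
The operand comments on `VPLoadStoreSDNode` above imply that a store's extra data operand shifts every later operand index up by one. A minimal standalone sketch of that indexing rule follows; `VPMemOp`, `VPMemOperandIndices`, and `operandIndices` are hypothetical names used only for illustration and are not part of the patch or of LLVM.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two opcodes; not the real ISD enum.
enum class VPMemOp { Load, Store };

// Operand layout as stated in the patch comments:
//   VP_LOAD  (Chain, ptr, mask, VLen)
//   VP_STORE (Chain, data, ptr, mask, VLen)
struct VPMemOperandIndices {
  std::size_t BasePtr;
  std::size_t Mask;
  std::size_t VectorLength;
};

// A store carries one extra operand (the data) right after the chain,
// so each of the remaining operand indices is one higher than for a load.
inline VPMemOperandIndices operandIndices(VPMemOp Op) {
  const std::size_t Shift = (Op == VPMemOp::Store) ? 1 : 0;
  return VPMemOperandIndices{1 + Shift, 2 + Shift, 3 + Shift};
}
```

The same shift-by-one pattern appears in the gather/scatter accessors, where the scatter's value operand sits directly after the chain.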

llvm/include/llvm/IR/Attributes.td

	Show First 20 Lines • Show All 133 Lines • ▼ Show 20 Lines
	def ReadOnly : EnumAttr<"readonly">;			def ReadOnly : EnumAttr<"readonly">;

	/// Return value is always equal to this argument.			/// Return value is always equal to this argument.
	def Returned : EnumAttr<"returned">;			def Returned : EnumAttr<"returned">;

	/// Parameter is required to be a trivial constant.			/// Parameter is required to be a trivial constant.
	def ImmArg : EnumAttr<"immarg">;			def ImmArg : EnumAttr<"immarg">;

				/// Return value that is equal to this argument on enabled lanes (mask).
				def Passthru : EnumAttr<"passthru">;

				/// Mask argument that applies to this function.
				def Mask : EnumAttr<"mask">;

				/// Dynamic Vector Length argument of this function.
				def VectorLength : EnumAttr<"vlen">;

	/// Function can return twice.			/// Function can return twice.
	def ReturnsTwice : EnumAttr<"returns_twice">;			def ReturnsTwice : EnumAttr<"returns_twice">;

	/// Safe Stack protection.			/// Safe Stack protection.
	def SafeStack : EnumAttr<"safestack">;			def SafeStack : EnumAttr<"safestack">;

	/// Shadow Call Stack protection.			/// Shadow Call Stack protection.
	def ShadowCallStack : EnumAttr<"shadowcallstack">;			def ShadowCallStack : EnumAttr<"shadowcallstack">;
	▲ Show 20 Lines • Show All 116 Lines • Show Last 20 Lines
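
The `mask` and `vlen` attributes above tag the two predicate arguments of a function. As a rough scalar illustration of how such predicates are usually combined, the sketch below assumes a lane is active only when its mask bit is set and its index is below the vector length. That rule, the helper name `predicatedAdd`, and the `DisabledValue` placeholder are assumptions made for this example, not definitions taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a predicated element-wise add.
// Assumption: lane I is active iff Mask[I] is true and I < VLen.
// Disabled lanes get DisabledValue here purely as a placeholder; a real
// predicated operation may leave them unspecified.
// Precondition: A, B and Mask all have the same length.
inline std::vector<int32_t> predicatedAdd(const std::vector<int32_t> &A,
                                          const std::vector<int32_t> &B,
                                          const std::vector<bool> &Mask,
                                          uint32_t VLen,
                                          int32_t DisabledValue) {
  std::vector<int32_t> Result(A.size(), DisabledValue);
  for (std::size_t I = 0; I < A.size(); ++I) {
    const bool Active = Mask[I] && I < VLen;
    if (Active)
      Result[I] = A[I] + B[I];
  }
  return Result;
}
```

For example, with four lanes, a mask of `{true, false, true, true}` and a vector length of 3, only lanes 0 and 2 are computed under this model.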

llvm/include/llvm/IR/FPEnv.h

	Show All 15 Lines
	#define LLVM_IR_FLOATINGPOINT_H			#define LLVM_IR_FLOATINGPOINT_H

	#include "llvm/ADT/Optional.h"			#include "llvm/ADT/Optional.h"
	#include "llvm/ADT/StringRef.h"			#include "llvm/ADT/StringRef.h"
	#include <stdint.h>			#include <stdint.h>

	namespace llvm {			namespace llvm {

				class LLVMContext;
				class Value;

	namespace fp {			namespace fp {

	/// Rounding mode used for floating point operations.			/// Rounding mode used for floating point operations.
	///			///
	/// Each of these values correspond to some metadata argument value of a			/// Each of these values correspond to some metadata argument value of a
	/// constrained floating point intrinsic. See the LLVM Language Reference Manual			/// constrained floating point intrinsic. See the LLVM Language Reference Manual
	/// for details.			/// for details.
	enum RoundingMode : uint8_t {			enum RoundingMode : uint8_t {
	Show All 10 Lines
	/// constrained floating point intrinsic. See the LLVM Language Reference Manual			/// constrained floating point intrinsic. See the LLVM Language Reference Manual
	/// for details.			/// for details.
	enum ExceptionBehavior : uint8_t {			enum ExceptionBehavior : uint8_t {
	ebIgnore, ///< This corresponds to "fpexcept.ignore".			ebIgnore, ///< This corresponds to "fpexcept.ignore".
	ebMayTrap, ///< This corresponds to "fpexcept.maytrap".			ebMayTrap, ///< This corresponds to "fpexcept.maytrap".
	ebStrict ///< This corresponds to "fpexcept.strict".			ebStrict ///< This corresponds to "fpexcept.strict".
	};			};

	}			} // namespace fp

	/// Returns a valid RoundingMode enumerator when given a string			/// Returns a valid RoundingMode enumerator when given a string
	/// that is valid as input in constrained intrinsic rounding mode			/// that is valid as input in constrained intrinsic rounding mode
	/// metadata.			/// metadata.
	Optional<fp::RoundingMode> StrToRoundingMode(StringRef);			Optional<fp::RoundingMode> StrToRoundingMode(StringRef);

	/// For any RoundingMode enumerator, returns a string valid as input in			/// For any RoundingMode enumerator, returns a string valid as input in
	/// constrained intrinsic rounding mode metadata.			/// constrained intrinsic rounding mode metadata.
	Optional<StringRef> RoundingModeToStr(fp::RoundingMode);			Optional<StringRef> RoundingModeToStr(fp::RoundingMode);

	/// Returns a valid ExceptionBehavior enumerator when given a string			/// Returns a valid ExceptionBehavior enumerator when given a string
	/// valid as input in constrained intrinsic exception behavior metadata.			/// valid as input in constrained intrinsic exception behavior metadata.
	Optional<fp::ExceptionBehavior> StrToExceptionBehavior(StringRef);			Optional<fp::ExceptionBehavior> StrToExceptionBehavior(StringRef);

	/// For any ExceptionBehavior enumerator, returns a string valid as			/// For any ExceptionBehavior enumerator, returns a string valid as
	/// input in constrained intrinsic exception behavior metadata.			/// input in constrained intrinsic exception behavior metadata.
	Optional<StringRef> ExceptionBehaviorToStr(fp::ExceptionBehavior);			Optional<StringRef> ExceptionBehaviorToStr(fp::ExceptionBehavior);

	}			/// Return the IR Value representation of any ExceptionBehavior.
				Value *GetConstrainedFPExcept(LLVMContext &, fp::ExceptionBehavior);

				/// Return the IR Value representation of any RoundingMode.
				Value *GetConstrainedFPRounding(LLVMContext &, fp::RoundingMode);

				} // namespace llvm
	#endif			#endif
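
The `ExceptionBehavior` comments above pair each enumerator with a metadata string. A standalone sketch of that round trip is shown below; it does not use LLVM headers, and the enum and both function names are simplified stand-ins chosen for illustration (the real declarations take `StringRef` and return `Optional`).

```cpp
#include <optional>
#include <string>

// Simplified stand-in for fp::ExceptionBehavior.
enum class ExceptionBehavior : unsigned char { Ignore, MayTrap, Strict };

// Map an enumerator to the metadata string named in the header comments.
inline std::optional<std::string> exceptionBehaviorToStr(ExceptionBehavior EB) {
  switch (EB) {
  case ExceptionBehavior::Ignore:
    return std::string("fpexcept.ignore");
  case ExceptionBehavior::MayTrap:
    return std::string("fpexcept.maytrap");
  case ExceptionBehavior::Strict:
    return std::string("fpexcept.strict");
  }
  return std::nullopt; // Value outside the enumeration.
}

// Inverse mapping; returns std::nullopt for an unrecognized string.
inline std::optional<ExceptionBehavior>
strToExceptionBehavior(const std::string &S) {
  if (S == "fpexcept.ignore")
    return ExceptionBehavior::Ignore;
  if (S == "fpexcept.maytrap")
    return ExceptionBehavior::MayTrap;
  if (S == "fpexcept.strict")
    return ExceptionBehavior::Strict;
  return std::nullopt;
}
```

Judging from the lines removed from `IRBuilder.h` further down, the new `GetConstrainedFPExcept` helper presumably wraps such a string in an `MDString` and returns it as a `MetadataAsValue`; that step is omitted here because it needs an `LLVMContext`.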

llvm/include/llvm/IR/IRBuilder.h

Show All 23 Lines
#include "llvm/IR/ConstantFolder.h"		#include "llvm/IR/ConstantFolder.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DebugLoc.h"		#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
▲ Show 20 Lines • Show All 810 Lines • ▼ Show 20 Lines	public:
/// Create a call to Masked Scatter intrinsic		/// Create a call to Masked Scatter intrinsic
CallInst *CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment,		CallInst *CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment,
Value *Mask = nullptr);		Value *Mask = nullptr);

/// Create an assume intrinsic call that allows the optimizer to		/// Create an assume intrinsic call that allows the optimizer to
/// assume that the provided condition will be true.		/// assume that the provided condition will be true.
CallInst *CreateAssumption(Value *Cond);		CallInst *CreateAssumption(Value *Cond);

		/// Call an arithmetic VP intrinsic.
		Instruction *CreateVectorPredicatedInst(unsigned OC, ArrayRef<Value *>,
		Instruction *FMFSource = nullptr,
		const Twine &Name = "");

		/// Call a comparison VP intrinsic.
		Instruction *CreateVectorPredicatedCmp(CmpInst::Predicate Pred,
		Value *FirstOp, Value *SndOp, Value *Mask,
		Value *VectorLength,
		const Twine &Name = "");

		/// Call a comparison VP intrinsic.
		Instruction *CreateVectorPredicatedReduce(Module &M, CmpInst::Predicate Pred,
		Value *FirstOp, Value *SndOp, Value *Mask,
		Value *VectorLength,
		const Twine &Name = "");

/// Create a call to the experimental.gc.statepoint intrinsic to		/// Create a call to the experimental.gc.statepoint intrinsic to
/// start a new statepoint sequence.		/// start a new statepoint sequence.
CallInst *CreateGCStatepointCall(uint64_t ID, uint32_t NumPatchBytes,		CallInst *CreateGCStatepointCall(uint64_t ID, uint32_t NumPatchBytes,
Value *ActualCallee,		Value *ActualCallee,
ArrayRef<Value *> CallArgs,		ArrayRef<Value *> CallArgs,
ArrayRef<Value *> DeoptArgs,		ArrayRef<Value *> DeoptArgs,
ArrayRef<Value *> GCArgs,		ArrayRef<Value *> GCArgs,
const Twine &Name = "");		const Twine &Name = "");
▲ Show 20 Lines • Show All 365 Lines • ▼ Show 20 Lines	private:
}		}

Value *getConstrainedFPExcept(Optional<fp::ExceptionBehavior> Except) {		Value *getConstrainedFPExcept(Optional<fp::ExceptionBehavior> Except) {
fp::ExceptionBehavior UseExcept = DefaultConstrainedExcept;		fp::ExceptionBehavior UseExcept = DefaultConstrainedExcept;

if (Except.hasValue())		if (Except.hasValue())
UseExcept = Except.getValue();		UseExcept = Except.getValue();

Optional<StringRef> ExceptStr = ExceptionBehaviorToStr(UseExcept);		return GetConstrainedFPExcept(Context, UseExcept);
assert(ExceptStr.hasValue() && "Garbage strict exception behavior!");
auto *ExceptMDS = MDString::get(Context, ExceptStr.getValue());

return MetadataAsValue::get(Context, ExceptMDS);
}		}

Value *getConstrainedFPPredicate(CmpInst::Predicate Predicate) {		Value *getConstrainedFPPredicate(CmpInst::Predicate Predicate) {
assert(CmpInst::isFPPredicate(Predicate) &&		assert(CmpInst::isFPPredicate(Predicate) &&
Predicate != CmpInst::FCMP_FALSE &&		Predicate != CmpInst::FCMP_FALSE &&
Predicate != CmpInst::FCMP_TRUE &&		Predicate != CmpInst::FCMP_TRUE &&
"Invalid constrained FP comparison predicate!");		"Invalid constrained FP comparison predicate!");

▲ Show 20 Lines • Show All 1,783 Lines • Show Last 20 Lines

llvm/include/llvm/IR/IntrinsicInst.h

Show First 20 Lines • Show All 200 Lines • ▼ Show 20 Lines	static bool classof(const IntrinsicInst *I) {
return I->getIntrinsicID() == Intrinsic::dbg_label;		return I->getIntrinsicID() == Intrinsic::dbg_label;
}		}
static bool classof(const Value *V) {		static bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
}		}
/// @}		/// @}
};		};

		/// This is the common base class for vector predication intrinsics.
		class VPIntrinsic : public IntrinsicInst {
		public:
		enum class VPTypeToken : int8_t {
		Returned = 0, // vectorized return type.
		Vector = 1, // vector operand type
		Pointer = 2, // vector pointer-operand type (memory op)
		Mask = 3 // vector mask type
		};

		using TypeTokenVec = SmallVector<VPTypeToken, 4>;
		using ShortTypeVec = SmallVector<Type *, 4>;

		/// \brief Declares a llvm.vp.* intrinsic in \p M that matches the parameters \p Params.
		static Function* GetDeclarationForParams(Module* M, Intrinsic::ID, ArrayRef<Value *> Params, Type* VecRetTy = nullptr);

		// Type tokens required to instantiate this intrinsic.
		static TypeTokenVec GetTypeTokens(Intrinsic::ID);

		// whether the intrinsic has a rounding mode parameter (regardless of
		// setting).
		static bool HasRoundingModeParam(Intrinsic::ID VPID) { return GetRoundingModeParamPos(VPID) != None; }
		// whether the intrinsic has an exception behavior parameter (regardless of
		// setting).
		static bool HasExceptionBehaviorParam(Intrinsic::ID VPID) { return GetExceptionBehaviorParamPos(VPID) != None; }
		static Optional<int> GetMaskParamPos(Intrinsic::ID IntrinsicID);
		static Optional<int> GetVectorLengthParamPos(Intrinsic::ID IntrinsicID);
		static Optional<int>
		GetExceptionBehaviorParamPos(Intrinsic::ID IntrinsicID);
		static Optional<int> GetRoundingModeParamPos(Intrinsic::ID IntrinsicID);
		// the llvm.vp.* intrinsic for this other kind of intrinsic.
		static Intrinsic::ID GetForIntrinsic(Intrinsic::ID IntrinsicID);
		static Intrinsic::ID GetForOpcode(unsigned OC);

		// Whether \p ID is a VP intrinsic ID.
		static bool IsVPIntrinsic(Intrinsic::ID);

		/// TODO make this private!
		/// \brief Generate the disambiguating type vec for this VP Intrinsic.
		/// \returns A disambiguating type vector to instantiate this intrinsic.
		/// \p TTVec
		/// Vector of disambiguating tokens.
		/// \p VecRetTy
		/// The return type of the intrinsic (optional)
		/// \p VecPtrTy
		/// The pointer operand type (optional)
		/// \p VectorTy
		/// The vector data type of the operation.
		static VPIntrinsic::ShortTypeVec
		EncodeTypeTokens(VPIntrinsic::TypeTokenVec TTVec, Type *VecRetTy,
		Type *VecPtrTy, Type &VectorTy);

		/// set the mask parameter.
		/// this asserts if the underlying intrinsic has no mask parameter.
		void setMaskParam(Value *);

		/// set the vector length parameter.
		/// this asserts if the underlying intrinsic has no vector length
		/// parameter.
		void setVectorLengthParam(Value *);

		/// \return the mask parameter or nullptr.
		Value *getMaskParam() const;

		/// \return the vector length parameter or nullptr.
		Value *getVectorLengthParam() const;

		/// \return whether the vector length param can be ignored.
		bool canIgnoreVectorLengthParam() const;

		/// \return the alignment of the pointer used by this load/store/gather or scatter.
		MaybeAlign getPointerAlignment() const;
		// MaybeAlign setPointerAlignment(Align NewAlign); // TODO

		/// \return The pointer operand of this load, store, gather or scatter.
		Value *getMemoryPointerParam() const;
		static Optional<int> GetMemoryPointerParamPos(Intrinsic::ID);

		/// \return The data (payload) operand of this store or scatter.
		Value *getMemoryDataParam() const;
		static Optional<int> GetMemoryDataParamPos(Intrinsic::ID);

		/// \return The vector to reduce if this is a reduction operation.
		Value *getReductionVectorParam() const;
		static Optional<int> GetReductionVectorParamPos(Intrinsic::ID VPID);

		/// \return The initial value if this is a reduction operation.
		Value *getReductionAccuParam() const;
		static Optional<int> GetReductionAccuParamPos(Intrinsic::ID VPID);

		/// \return the static element count (vector number of elements) the vector
		/// length parameter applies to.
		ElementCount getVectorLength() const;

		bool isUnaryOp() const;
		static bool IsUnaryVPOp(Intrinsic::ID);
		bool isBinaryOp() const;
		static bool IsBinaryVPOp(Intrinsic::ID);
		bool isTernaryOp() const;
		static bool IsTernaryVPOp(Intrinsic::ID);

		/// \returns Whether this is a comparison operation.
		bool isCompareOp() const;
		static bool IsCompareVPOp(Intrinsic::ID);

		/// \returns The comparison predicate.
		CmpInst::Predicate getCmpPredicate() const;

		// Constrained fp-math
		// whether this is an fp op with non-standard rounding or exception
		// behavior.
		bool isConstrainedOp() const;

		// the specified rounding mode.
		Optional<fp::RoundingMode> getRoundingMode() const;
		// the specified exception behavior.
		Optional<fp::ExceptionBehavior> getExceptionBehavior() const;

		// llvm.vp.reduction.*
		bool isReductionOp() const;
		static bool IsVPReduction(Intrinsic::ID VPIntrin);

		// Methods for support type inquiry through isa, cast, and dyn_cast:
		static bool classof(const IntrinsicInst *I) {
		return IsVPIntrinsic(I->getIntrinsicID());
		}
		static bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}

		/// \return The non-VP intrinsic that is functionally equivalent to this VP
		/// intrinsic.
		Intrinsic::ID getFunctionalIntrinsicID() const {
		Intrinsic::ID IID = Intrinsic::not_intrinsic;
		// Return a constrained intrinsic if this intrinsic does not operate in
		// the standard fp environment.
		if (isConstrainedOp()) {
		IID = GetConstrainedIntrinsicForVP(getIntrinsicID());
		}
		if (IID == Intrinsic::not_intrinsic) {
		IID = GetFunctionalIntrinsicForVP(getIntrinsicID());
		}
		return IID;
		}

		/// \return The llvm.experimental.constrained.* intrinsic that is
		/// functionally equivalent to this llvm.vp.* intrinsic.
		static Intrinsic::ID GetConstrainedIntrinsicForVP(Intrinsic::ID VPID);

		/// \return The intrinsic that is
		/// functionally equivalent to this llvm.vp.* intrinsic.
		static Intrinsic::ID GetFunctionalIntrinsicForVP(Intrinsic::ID VPID);

		// Equivalent non-predicated opcode
		unsigned getFunctionalOpcode() const {
		if (isConstrainedOp()) {
		return Instruction::Call;
		}
		return GetFunctionalOpcodeForVP(getIntrinsicID());
		}

		// Equivalent non-predicated opcode
		static unsigned GetFunctionalOpcodeForVP(Intrinsic::ID ID);
		};
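
The static `Get*ParamPos` queries above return an optional operand index, and the `Has*Param` helpers are defined in terms of them. The following is a minimal standalone sketch of that pattern, not the patch's implementation. `VPOp`, `ParamLayout`, `getLayout` and the lowercase helper names are hypothetical, `std::optional` stands in for `llvm::Optional`, and the positions are read off the operand lists that `Intrinsics.td` declares in this patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for three intrinsic IDs; not LLVM's real enum.
enum class VPOp { Add, FAdd, Select };

// Operand positions of the special parameters of one intrinsic.
struct ParamLayout {
  std::optional<int> RoundingMode;
  std::optional<int> ExceptionBehavior;
  std::optional<int> Mask;
  std::optional<int> VectorLength;
};

// Positions follow the operand lists declared in Intrinsics.td in this patch.
inline ParamLayout getLayout(VPOp Op) {
  switch (Op) {
  case VPOp::Add: // (lhs, rhs, mask, evl)
    return {std::nullopt, std::nullopt, 2, 3};
  case VPOp::FAdd: // (lhs, rhs, rounding, except, mask, evl)
    return {2, 3, 4, 5};
  case VPOp::Select: // (mask, onTrue, onFalse, evl)
    return {std::nullopt, std::nullopt, 0, 3};
  }
  assert(false && "unhandled VPOp");
  return {};
}

// Mirrors the shape of HasRoundingModeParam: present iff a position exists.
inline bool hasRoundingModeParam(VPOp Op) {
  return getLayout(Op).RoundingMode.has_value();
}

// Mirrors the shape of HasExceptionBehaviorParam.
inline bool hasExceptionBehaviorParam(VPOp Op) {
  return getLayout(Op).ExceptionBehavior.has_value();
}
```

Keeping the position lookup as the single source of truth means the boolean queries cannot drift out of sync with the operand layout.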

/// This is the common base class for constrained floating point intrinsics.		/// This is the common base class for constrained floating point intrinsics.
class ConstrainedFPIntrinsic : public IntrinsicInst {		class ConstrainedFPIntrinsic : public IntrinsicInst {
public:		public:

bool isUnaryOp() const;		bool isUnaryOp() const;
bool isTernaryOp() const;		bool isTernaryOp() const;
Optional<fp::RoundingMode> getRoundingMode() const;		Optional<fp::RoundingMode> getRoundingMode() const;
Optional<fp::ExceptionBehavior> getExceptionBehavior() const;		Optional<fp::ExceptionBehavior> getExceptionBehavior() const;

// Methods for support type inquiry through isa, cast, and dyn_cast:		// Methods for support type inquiry through isa, cast, and dyn_cast:
static bool classof(const IntrinsicInst *I);		static bool classof(const IntrinsicInst *I);
static bool classof(const Value *V) {		static bool classof(const Value *V) {
▲ Show 20 Lines • Show All 673 Lines • Show Last 20 Lines

llvm/include/llvm/IR/Intrinsics.td

	Show All 21 Lines
	// Intr*Mem - Memory properties. If no property is set, the worst case			// Intr*Mem - Memory properties. If no property is set, the worst case
	// is assumed (it may read and write any memory it can get access to and it may			// is assumed (it may read and write any memory it can get access to and it may
	// have other side effects).			// have other side effects).

	// IntrNoMem - The intrinsic does not access memory or have any other side			// IntrNoMem - The intrinsic does not access memory or have any other side
	// effects. It may be CSE'd deleted if dead, etc.			// effects. It may be CSE'd deleted if dead, etc.
	def IntrNoMem : IntrinsicProperty;			def IntrNoMem : IntrinsicProperty;

				// IntrNoSync - Threads executing the intrinsic will not synchronize using
				// memory or other means.
				def IntrNoSync : IntrinsicProperty;

	// IntrReadMem - This intrinsic only reads from memory. It does not write to			// IntrReadMem - This intrinsic only reads from memory. It does not write to
	// memory and has no other side effects. Therefore, it cannot be moved across			// memory and has no other side effects. Therefore, it cannot be moved across
	// potentially aliasing stores. However, it can be reordered otherwise and can			// potentially aliasing stores. However, it can be reordered otherwise and can
	// be deleted if dead.			// be deleted if dead.
	def IntrReadMem : IntrinsicProperty;			def IntrReadMem : IntrinsicProperty;

	// IntrWriteMem - This intrinsic only writes to memory, but does not read from			// IntrWriteMem - This intrinsic only writes to memory, but does not read from
	// memory, and has no other side effects. This means dead stores before calls			// memory, and has no other side effects. This means dead stores before calls
	▲ Show 20 Lines • Show All 55 Lines • ▼ Show 20 Lines
	}			}

	// ReadNone - The specified argument pointer is not dereferenced by the			// ReadNone - The specified argument pointer is not dereferenced by the
	// intrinsic.			// intrinsic.
	class ReadNone<int argNo> : IntrinsicProperty {			class ReadNone<int argNo> : IntrinsicProperty {
	int ArgNo = argNo;			int ArgNo = argNo;
	}			}

				// VectorLength - The specified argument is the Dynamic Vector Length of the
				// operation.
				class VectorLength<int argNo> : IntrinsicProperty {
				int ArgNo = argNo;
				}

				// Mask - The specified argument contains the per-lane mask of this
				// intrinsic. Inputs on masked-out lanes must not affect the result of this
				// intrinsic (except for the Passthru argument).
				class Mask<int argNo> : IntrinsicProperty {
				int ArgNo = argNo;
				}
				// Passthru - The specified argument contains the per-lane return value
				// for this vector intrinsic where the mask is false.
				// (requires the Mask attribute in the same function)
				class Passthru<int argNo> : IntrinsicProperty {
				int ArgNo = argNo;
				}

	def IntrNoReturn : IntrinsicProperty;			def IntrNoReturn : IntrinsicProperty;

	def IntrWillReturn : IntrinsicProperty;			def IntrWillReturn : IntrinsicProperty;

	// IntrCold - Calls to this intrinsic are cold.			// IntrCold - Calls to this intrinsic are cold.
	// Parallels the cold attribute on LLVM IR functions.			// Parallels the cold attribute on LLVM IR functions.
	def IntrCold : IntrinsicProperty;			def IntrCold : IntrinsicProperty;

	▲ Show 20 Lines • Show All 1,039 Lines • ▼ Show 20 Lines

	// Intrinsic to detect whether its argument is a constant.			// Intrinsic to detect whether its argument is a constant.
	def int_is_constant : Intrinsic<[llvm_i1_ty], [llvm_any_ty], [IntrNoMem, IntrWillReturn], "llvm.is.constant">;			def int_is_constant : Intrinsic<[llvm_i1_ty], [llvm_any_ty], [IntrNoMem, IntrWillReturn], "llvm.is.constant">;

	// Intrinsic to mask out bits of a pointer.			// Intrinsic to mask out bits of a pointer.
	def int_ptrmask: Intrinsic<[llvm_anyptr_ty], [llvm_anyptr_ty, llvm_anyint_ty],			def int_ptrmask: Intrinsic<[llvm_anyptr_ty], [llvm_anyptr_ty, llvm_anyint_ty],
	[IntrNoMem, IntrSpeculatable, IntrWillReturn]>;			[IntrNoMem, IntrSpeculatable, IntrWillReturn]>;

				//===---------------- Vector Predication Intrinsics --------------===//

				// Memory Intrinsics
				def int_vp_store : Intrinsic<[],
				[ llvm_anyvector_ty,
				LLVMAnyPointerType<LLVMMatchType<0>>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ NoCapture<1>, IntrNoSync, IntrArgMemOnly, IntrWillReturn, Mask<2>, VectorLength<3> ]>;

				def int_vp_load : Intrinsic<[ llvm_anyvector_ty],
				[ LLVMAnyPointerType<LLVMMatchType<0>>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ NoCapture<0>, IntrNoSync, IntrReadMem, IntrWillReturn, IntrArgMemOnly, Mask<1>, VectorLength<2> ]>;

				def int_vp_gather: Intrinsic<[ llvm_anyvector_ty],
				[ LLVMVectorOfAnyPointersToElt<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ IntrReadMem, IntrNoSync, IntrWillReturn, IntrArgMemOnly, Mask<1>, VectorLength<2> ]>;

				def int_vp_scatter: Intrinsic<[],
				[ llvm_anyvector_ty,
				LLVMVectorOfAnyPointersToElt<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ IntrArgMemOnly, IntrNoSync, IntrWillReturn, Mask<2>, VectorLength<3> ]>;
				// TODO allow IntrNoCapture for vectors of pointers

				// Reductions
				let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn, Mask<1>, VectorLength<2>] in {
				def int_vp_reduce_add : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_mul : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_and : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_or : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_xor : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_smax : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_smin : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_umax : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_umin : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_fmax : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_fmin : Intrinsic<[LLVMVectorElementType<0>],
				[llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				}

				let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn, Mask<2>, VectorLength<3>] in {
				def int_vp_reduce_fadd : Intrinsic<[LLVMVectorElementType<0>],
				[LLVMVectorElementType<0>,
				llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_reduce_fmul : Intrinsic<[LLVMVectorElementType<0>],
				[LLVMVectorElementType<0>,
				llvm_anyvector_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				}
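
As a reading aid for the `fadd`/`fmul` reductions, which take a scalar start value ahead of the vector operand, here is a scalar reference model of `int_vp_reduce_fadd`. It is only a sketch: the name `vpReduceFAddModel` is hypothetical, and accumulating the active lanes in lane order is an assumption, since the declaration above does not fix an evaluation order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a predicated fadd reduction: (start, vector, mask, evl).
// Assumption: active lanes are accumulated in increasing lane order.
inline double vpReduceFAddModel(double Start, const std::vector<double> &Vec,
                                const std::vector<bool> &Mask,
                                std::uint32_t EVL) {
  assert(Vec.size() == Mask.size() && "mask must have one bit per lane");
  double Acc = Start;
  for (std::size_t I = 0; I < Vec.size(); ++I) {
    // Only lanes below EVL whose mask bit is set contribute.
    if (I < EVL && Mask[I])
      Acc += Vec[I];
  }
  return Acc;
}
```

With no active lanes the model returns the start value unchanged, which is why the accumulator operand is useful for chaining reductions.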

				// Binary operators
				let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn] in {
				def int_vp_add : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_sub : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_mul : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_sdiv : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_udiv : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_srem : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_urem : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				// Logical operators
				def int_vp_ashr : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_lshr : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_shl : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_or : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_and : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_xor : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				}
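
Every binary operator above shares the operand list `(lhs, rhs, mask, evl)`. A scalar reference model of `int_vp_add` may help when reviewing the semantics. This is a sketch under stated assumptions: `AddLane` and `vpAddModel` are hypothetical names, and a disabled lane (mask bit clear, or index at or beyond the vector length) is modelled as an empty `std::optional` because it carries no defined value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// A lane either holds a defined value or nothing (disabled lane).
using AddLane = std::optional<std::uint32_t>;

// Scalar model of a predicated add: (lhs, rhs, mask, evl).
inline std::vector<AddLane> vpAddModel(const std::vector<std::uint32_t> &LHS,
                                       const std::vector<std::uint32_t> &RHS,
                                       const std::vector<bool> &Mask,
                                       std::uint32_t EVL) {
  assert(LHS.size() == RHS.size() && LHS.size() == Mask.size());
  std::vector<AddLane> Result(LHS.size());
  for (std::size_t I = 0; I < LHS.size(); ++I) {
    // A lane is active only if it is below EVL and its mask bit is set.
    if (I < EVL && Mask[I])
      Result[I] = LHS[I] + RHS[I]; // unsigned add wraps, like IR `add`
  }
  return Result;
}
```

For example, with mask `{1, 0, 1, 1}` and a vector length of 3, only lanes 0 and 2 produce a value.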

				// Comparison
				// TODO add signalling fcmp
				// The last argument is the comparison predicate
				def int_vp_icmp : Intrinsic<[ LLVMScalarOrSameVectorWidth<0, llvm_i1_ty> ],
				[ llvm_anyvector_ty,
				LLVMMatchType<0>,
				llvm_i8_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ],
				[ IntrWillReturn, IntrNoSync, IntrNoMem, Mask<3>, VectorLength<4>, ImmArg<2> ]>;

				def int_vp_fcmp : Intrinsic<[ LLVMScalarOrSameVectorWidth<0, llvm_i1_ty> ],
				[ llvm_anyvector_ty,
				LLVMMatchType<0>,
				llvm_i8_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ],
				[ IntrWillReturn, IntrNoSync, IntrNoMem, Mask<3>, VectorLength<4>, ImmArg<2> ]>;
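
The comparison intrinsics carry the predicate as an immediate `i8` operand (`ImmArg<2>`). The sketch below models `int_vp_icmp` on scalars. It assumes the immediate uses the same numeric encoding as `CmpInst::Predicate` (in LLVM, `ICMP_EQ` is 32, `ICMP_ULT` is 36 and `ICMP_SLT` is 40); the patch excerpt does not confirm this, and the names `kIcmpEq`, `kIcmpUlt`, `kIcmpSlt`, `evalPredicate` and `vpICmpModel` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumed encoding: numeric values of three CmpInst::Predicate enumerators.
constexpr std::uint8_t kIcmpEq = 32, kIcmpUlt = 36, kIcmpSlt = 40;

// Evaluate one of the three predicates covered by this sketch.
inline bool evalPredicate(std::uint8_t Pred, std::int32_t A, std::int32_t B) {
  switch (Pred) {
  case kIcmpEq:
    return A == B;
  case kIcmpUlt:
    return static_cast<std::uint32_t>(A) < static_cast<std::uint32_t>(B);
  case kIcmpSlt:
    return A < B;
  }
  assert(false && "predicate not covered by this sketch");
  return false;
}

// Scalar model of a predicated icmp: (lhs, rhs, pred, mask, evl).
// Disabled lanes are left empty because they carry no defined value.
inline std::vector<std::optional<bool>>
vpICmpModel(const std::vector<std::int32_t> &LHS,
            const std::vector<std::int32_t> &RHS, std::uint8_t Pred,
            const std::vector<bool> &Mask, std::uint32_t EVL) {
  assert(LHS.size() == RHS.size() && LHS.size() == Mask.size());
  std::vector<std::optional<bool>> Result(LHS.size());
  for (std::size_t I = 0; I < LHS.size(); ++I) {
    if (I < EVL && Mask[I])
      Result[I] = evalPredicate(Pred, LHS[I], RHS[I]);
  }
  return Result;
}
```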



				// Shuffle
				def int_vp_vshift: Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				llvm_i32_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<2>, VectorLength<3> ]>;

				def int_vp_expand: Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<1>, VectorLength<2> ]>;

				def int_vp_compress: Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ IntrNoMem, IntrNoSync, IntrWillReturn, VectorLength<2> ]>;

				// Select
				def int_vp_select : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_i32_ty],
				[ IntrNoMem, IntrNoSync, IntrWillReturn, Passthru<2>, Mask<0>, VectorLength<3> ]>;
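
`int_vp_select` is the one intrinsic here that combines `Mask<0>` with `Passthru<2>`: operand 2 supplies the result wherever the mask is false. A scalar sketch of that reading follows. `SelLane` and `vpSelectModel` are hypothetical names, and leaving lanes at or beyond the vector length without a defined value is an assumption of this model rather than something the declaration states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using SelLane = std::optional<std::int32_t>;

// Scalar model of a predicated select: (mask, onTrue, onFalse, evl).
// OnFalse plays the Passthru role: it supplies the masked-off lanes.
inline std::vector<SelLane> vpSelectModel(const std::vector<bool> &Mask,
                                          const std::vector<std::int32_t> &OnTrue,
                                          const std::vector<std::int32_t> &OnFalse,
                                          std::uint32_t EVL) {
  assert(Mask.size() == OnTrue.size() && Mask.size() == OnFalse.size());
  std::vector<SelLane> Result(Mask.size());
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    // Lanes at or beyond EVL stay empty (assumed undefined).
    if (I < EVL)
      Result[I] = Mask[I] ? OnTrue[I] : OnFalse[I];
  }
  return Result;
}
```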

				// Compose
				def int_vp_compose : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_i32_ty,
				llvm_i32_ty],
				[ IntrNoMem, IntrNoSync, IntrWillReturn, VectorLength<3> ]>;



				// VP fp rounding and truncation
				let IntrProperties = [ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<2>, VectorLength<3> ] in {

				def int_vp_fptosi : Intrinsic<[ llvm_anyint_ty ],
				[ llvm_anyfloat_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				def int_vp_fptoui : Intrinsic<[ llvm_anyint_ty ],
				[ llvm_anyfloat_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				def int_vp_fpext : Intrinsic<[ llvm_anyfloat_ty ],
				[ llvm_anyfloat_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				}
				let IntrProperties = [ IntrNoMem, IntrWillReturn, Mask<3>, VectorLength<4> ] in {
				def int_vp_sitofp : Intrinsic<[ llvm_anyfloat_ty ],
				[ llvm_anyint_ty,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				def int_vp_uitofp : Intrinsic<[ llvm_anyfloat_ty ],
				[ llvm_anyint_ty,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				}


				let IntrProperties = [ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<3>, VectorLength<4> ] in {
				def int_vp_fptrunc : Intrinsic<[ llvm_anyfloat_ty ],
				[ llvm_anyfloat_ty,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				}

				// VP single argument constrained intrinsics.
				let IntrProperties = [ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<3>, VectorLength<4> ] in {
				// These intrinsics are sensitive to the rounding mode so we need constrained
				// versions of each of them. When strict rounding and exception control are
				// not required the non-constrained versions of these intrinsics should be
				// used.
				def int_vp_sqrt : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_sin : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_cos : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_log : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_log10: Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_log2 : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_exp : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_exp2 : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_rint : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_nearbyint : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_ceil : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_floor : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_round : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_trunc : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				}


				// VP two argument constrained intrinsics.
				let IntrProperties = [ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<4>, VectorLength<5> ] in {
				// These intrinsics are sensitive to the rounding mode so we need constrained
				// versions of each of them. When strict rounding and exception control are
				// not required the non-constrained versions of these intrinsics should be
				// used.
				def int_vp_powi : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				llvm_i32_ty,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_pow : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_maxnum : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;
				def int_vp_minnum : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty]>;

				}


				// VP standard fp-math intrinsics.
				def int_vp_fneg : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty],
				[ IntrNoMem, IntrWillReturn, Mask<2>, VectorLength<3> ]>;

				let IntrProperties = [ IntrNoMem, IntrWillReturn, Mask<4>, VectorLength<5> ] in {
				// These intrinsics are sensitive to the rounding mode so we need constrained
				// versions of each of them. When strict rounding and exception control are
				// not required the non-constrained versions of these intrinsics should be
				// used.
				def int_vp_fadd : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ]>;
				def int_vp_fsub : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ]>;
				def int_vp_fmul : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ]>;
				def int_vp_fdiv : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ]>;
				def int_vp_frem : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ]>;
				}

				def int_vp_fma : Intrinsic<[ llvm_anyvector_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty,
				LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
				llvm_i32_ty ],
				[ IntrNoMem, IntrNoSync, IntrWillReturn, Mask<5>, VectorLength<6> ]>;




	//===-------------------------- Masked Intrinsics -------------------------===//			//===-------------------------- Masked Intrinsics -------------------------===//
	//			// TODO poised for deprecation (to be superseded by llvm.vp.* intrinsics)
	def int_masked_store : Intrinsic<[], [llvm_anyvector_ty,			def int_masked_store : Intrinsic<[], [llvm_anyvector_ty,
	LLVMAnyPointerType<LLVMMatchType<0>>,			LLVMAnyPointerType<LLVMMatchType<0>>,
	llvm_i32_ty,			llvm_i32_ty,
	LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],			LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],
	[IntrArgMemOnly, IntrWillReturn, ImmArg<2>]>;			[IntrArgMemOnly, IntrWillReturn, ImmArg<2>]>;

	def int_masked_load : Intrinsic<[llvm_anyvector_ty],			def int_masked_load : Intrinsic<[llvm_anyvector_ty],
	[LLVMAnyPointerType<LLVMMatchType<0>>, llvm_i32_ty,			[LLVMAnyPointerType<LLVMMatchType<0>>, llvm_i32_ty,
	▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines

	// @llvm.memset.element.unordered.atomic.*(dest, value, length, elementsize)			// @llvm.memset.element.unordered.atomic.*(dest, value, length, elementsize)
	def int_memset_element_unordered_atomic			def int_memset_element_unordered_atomic
	: Intrinsic<[], [ llvm_anyptr_ty, llvm_i8_ty, llvm_anyint_ty, llvm_i32_ty ],			: Intrinsic<[], [ llvm_anyptr_ty, llvm_i8_ty, llvm_anyint_ty, llvm_i32_ty ],
	[ IntrWriteMem, IntrArgMemOnly, IntrWillReturn, NoCapture<0>, WriteOnly<0>,			[ IntrWriteMem, IntrArgMemOnly, IntrWillReturn, NoCapture<0>, WriteOnly<0>,
	ImmArg<3> ]>;			ImmArg<3> ]>;

	//===------------------------ Reduction Intrinsics ------------------------===//			//===------------------------ Reduction Intrinsics ------------------------===//
				// TODO poised for deprecation (to be superseded by llvm.vp.* intrinsics)
	//			//
	let IntrProperties = [IntrNoMem, IntrWillReturn] in {			let IntrProperties = [IntrNoMem, IntrWillReturn] in {
	def int_experimental_vector_reduce_v2_fadd : Intrinsic<[llvm_anyfloat_ty],			def int_experimental_vector_reduce_v2_fadd : Intrinsic<[llvm_anyfloat_ty],
	[LLVMMatchType<0>,			[LLVMMatchType<0>,
	llvm_anyvector_ty]>;			llvm_anyvector_ty]>;
	def int_experimental_vector_reduce_v2_fmul : Intrinsic<[llvm_anyfloat_ty],			def int_experimental_vector_reduce_v2_fmul : Intrinsic<[llvm_anyfloat_ty],
	[LLVMMatchType<0>,			[LLVMMatchType<0>,
	llvm_anyvector_ty]>;			llvm_anyvector_ty]>;
	[... 132 unchanged lines not shown ...]

llvm/include/llvm/IR/MatcherCast.h

This file was added.

				#ifndef LLVM_IR_MATCHERCAST_H
				#define LLVM_IR_MATCHERCAST_H

				//===- MatcherCast.h - Match on the LLVM IR --------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// Parameterized class hierarchy for templatized pattern matching.
				//
				//===----------------------------------------------------------------------===//


				namespace llvm {
				namespace PatternMatch {


				// type modification
				template<typename Matcher, typename DestClass>
				struct MatcherCast { };

				// whether the Value \p Obj behaves like a \p Class.
				template<typename MatcherClass, typename Class>
				bool match_isa(const Value* Obj) {
				using UnconstClass = typename std::remove_cv<Class>::type;
				using DestClass = typename MatcherCast<MatcherClass, UnconstClass>::ActualCastType;
				return isa<const DestClass>(Obj);
				}

				template<typename MatcherClass, typename Class>
				auto match_cast(const Value* Obj) {
				using UnconstClass = typename std::remove_cv<Class>::type;
				using DestClass = typename MatcherCast<MatcherClass, UnconstClass>::ActualCastType;
				return cast<const DestClass>(Obj);
				}
				template<typename MatcherClass, typename Class>
				auto match_dyn_cast(const Value* Obj) {
				using UnconstClass = typename std::remove_cv<Class>::type;
				using DestClass = typename MatcherCast<MatcherClass, UnconstClass>::ActualCastType;
				return dyn_cast<const DestClass>(Obj);
				}

				template<typename MatcherClass, typename Class>
				auto match_cast(Value* Obj) {
				using UnconstClass = typename std::remove_cv<Class>::type;
				using DestClass = typename MatcherCast<MatcherClass, UnconstClass>::ActualCastType;
				return cast<DestClass>(Obj);
				}
				template<typename MatcherClass, typename Class>
				auto match_dyn_cast(Value* Obj) {
				using UnconstClass = typename std::remove_cv<Class>::type;
				using DestClass = typename MatcherCast<MatcherClass, UnconstClass>::ActualCastType;
				return dyn_cast<DestClass>(Obj);
				}


				} // namespace PatternMatch

				} // namespace llvm

				#endif // LLVM_IR_MATCHERCAST_H
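
The `MatcherCast` trait above only becomes useful once a match context specializes it. The following is a minimal standalone sketch of that mechanism, not code from the patch: `Value`, `BinaryOperator`, `PredicatedBinaryOperator`, and `PredicatedContext` are toy stand-ins introduced for illustration, and `dynamic_cast` is used as an assumed substitute for LLVM's `isa<>`.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-ins for IR classes (hypothetical, not the LLVM types).
struct Value { virtual ~Value() = default; };
struct BinaryOperator : Value {};
struct PredicatedBinaryOperator : Value {};

// Primary template: intentionally empty, as in MatcherCast.h.
template <typename Matcher, typename DestClass> struct MatcherCast {};

struct EmptyContext {};
struct PredicatedContext {}; // hypothetical context for predicated ops

// Empty context: use the requested class verbatim.
template <typename DestClass> struct MatcherCast<EmptyContext, DestClass> {
  using ActualCastType = DestClass;
};

// Hypothetical predicated context: a pattern written against BinaryOperator
// is redirected to the predicated counterpart.
template <> struct MatcherCast<PredicatedContext, BinaryOperator> {
  using ActualCastType = PredicatedBinaryOperator;
};

// Same shape as match_isa in the patch; dynamic_cast replaces isa<>.
template <typename MatcherClass, typename Class>
bool match_isa(const Value *Obj) {
  using UnconstClass = typename std::remove_cv<Class>::type;
  using DestClass =
      typename MatcherCast<MatcherClass, UnconstClass>::ActualCastType;
  return dynamic_cast<const DestClass *>(Obj) != nullptr;
}
```

Under these assumptions, the same pattern text (`BinaryOperator`) tests for a different concrete class depending on the context type passed as the first template argument.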

llvm/include/llvm/IR/PatternMatch.h

[... 31 unchanged lines not shown ...]
#include "llvm/ADT/APInt.h"		#include "llvm/ADT/APInt.h"
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
		#include "llvm/IR/MatcherCast.h"

#include <cstdint>		#include <cstdint>


namespace llvm {		namespace llvm {
namespace PatternMatch {		namespace PatternMatch {

		// Use verbatim types in default (empty) context.
		struct EmptyContext {
		EmptyContext() {}

		EmptyContext(const Value *) {}

		EmptyContext(const EmptyContext & E) {}

		// reset this match context to be rooted at \p V
		void reset(Value * V) {}

		// accept a match where \p Val is in a non-leaf position in a match pattern
		bool acceptInnerNode(const Value * Val) const { return true; }

		// accept a match where \p Val is bound to a free variable.
		bool acceptBoundNode(const Value * Val) const { return true; }

		// whether this context is compatible with \p E.
		bool acceptContext(EmptyContext E) const { return true; }

		// merge the context \p E into this context and return whether the resulting context is valid.
		bool mergeContext(EmptyContext E) { return true; }

		// reset this context to be rooted at \p V, then match \p P against it.
		template <typename Val, typename Pattern> bool reset_match(Val *V, const Pattern &P) {
		reset(V);
		return const_cast<Pattern &>(P).match_context(V, *this);
		}

		// match in the current context
		template <typename Val, typename Pattern> bool try_match(Val *V, const Pattern &P) {
		return const_cast<Pattern &>(P).match_context(V, *this);
		}
		};

		template<typename DestClass>
		struct MatcherCast<EmptyContext, DestClass> { using ActualCastType = DestClass; };






		// match without (== empty) context
template <typename Val, typename Pattern> bool match(Val *V, const Pattern &P) {		template <typename Val, typename Pattern> bool match(Val *V, const Pattern &P) {
return const_cast<Pattern &>(P).match(V);		EmptyContext ECtx;
		return const_cast<Pattern &>(P).match_context(V, ECtx);
		}

		// match pattern in a given context
		template <typename Val, typename Pattern, typename MatchContext> bool match(Val *V, const Pattern &P, MatchContext & MContext) {
		return const_cast<Pattern &>(P).match_context(V, MContext);
}		}
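
The `EmptyContext` hooks (`reset`, `acceptInnerNode`, `acceptBoundNode`) are all no-ops, so their purpose is easier to see with a context that actually rejects something. The sketch below is a standalone toy and makes several assumptions: `Node`, `SameMaskContext`, and `TwoLevelMatch` are hypothetical names that do not appear in the patch, and the "all inner nodes share one mask" rule is only one plausible policy a predicated context might enforce.

```cpp
#include <cassert>

// Toy IR node (hypothetical): an operation guarded by a mask id, one operand.
struct Node {
  int MaskId;
  const Node *Operand;
};

// Context that accepts inner nodes only if they all share one mask id.
struct SameMaskContext {
  bool HasMask = false;
  int MaskId = 0;

  // Forget any mask recorded by a previous match attempt.
  void reset(const Node *) { HasMask = false; }

  // The first inner node fixes the mask; later ones must agree with it.
  bool acceptInnerNode(const Node *N) {
    if (!HasMask) {
      HasMask = true;
      MaskId = N->MaskId;
      return true;
    }
    return N->MaskId == MaskId;
  }

  // Leaves bound to variables are not constrained in this sketch.
  bool acceptBoundNode(const Node *) const { return true; }
};

// Matches "op(op(x))" and binds x, consulting the context at every step.
struct TwoLevelMatch {
  const Node *&Leaf;

  template <typename MatchContext>
  bool match_context(const Node *N, MatchContext &MContext) {
    if (!N || !N->Operand || !N->Operand->Operand)
      return false;
    if (!MContext.acceptInnerNode(N))
      return false;
    if (!MContext.acceptInnerNode(N->Operand))
      return false;
    if (!MContext.acceptBoundNode(N->Operand->Operand))
      return false;
    Leaf = N->Operand->Operand;
    return true;
  }
};

// Same shape as the three-argument match() above: the caller owns the context.
template <typename Pattern, typename MatchContext>
bool match(const Node *N, Pattern P, MatchContext &MContext) {
  return P.match_context(N, MContext);
}
```

With this toy, a two-level chain whose nodes carry the same mask id matches, while a chain mixing two mask ids is rejected by the context even though its shape fits the pattern.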



template <typename SubPattern_t> struct OneUse_match {		template <typename SubPattern_t> struct OneUse_match {
SubPattern_t SubPattern;		SubPattern_t SubPattern;

OneUse_match(const SubPattern_t &SP) : SubPattern(SP) {}		OneUse_match(const SubPattern_t &SP) : SubPattern(SP) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) {
return V->hasOneUse() && SubPattern.match(V);		EmptyContext EContext; return match_context(V, EContext);
		}

		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
		return V->hasOneUse() && SubPattern.match_context(V, MContext);
}		}
};		};

template <typename T> inline OneUse_match<T> m_OneUse(const T &SubPattern) {		template <typename T> inline OneUse_match<T> m_OneUse(const T &SubPattern) {
return SubPattern;		return SubPattern;
}		}
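
The change to `OneUse_match` shows the convention this patch applies to every matcher: the legacy `match(V)` entry point builds an `EmptyContext` and forwards to a templated `match_context(V, MContext)`, which threads the same context object into its sub-patterns. A minimal standalone sketch of that convention follows; it matches plain integers rather than IR values, and `IsEven` and `Positive_match` are hypothetical names used only for illustration.

```cpp
#include <cassert>

struct EmptyContext {};

// Leaf matcher: succeeds on even integers and ignores the context.
struct IsEven {
  bool match(int V) {
    EmptyContext EContext;
    return match_context(V, EContext);
  }
  template <typename MatchContext>
  bool match_context(int V, MatchContext &) {
    return V % 2 == 0;
  }
};

// Wrapper in the style of OneUse_match: performs its own check (V > 0 here),
// then forwards the same context object to the sub-pattern.
template <typename SubPattern_t> struct Positive_match {
  SubPattern_t SubPattern;

  bool match(int V) {
    EmptyContext EContext;
    return match_context(V, EContext);
  }
  template <typename MatchContext>
  bool match_context(int V, MatchContext &MContext) {
    return V > 0 && SubPattern.match_context(V, MContext);
  }
};
```

Existing callers of `match(V)` keep working unchanged, while context-aware callers go through `match_context` directly.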

template <typename Class> struct class_match {		template <typename Class> struct class_match {
template <typename ITy> bool match(ITy *V) { return isa<Class>(V); }		template <typename ITy> bool match(ITy *V) {
		EmptyContext EContext; return match_context<ITy, EmptyContext>(V, EContext);
		}
		template <typename ITy, typename MatchContext>
		bool match_context(ITy *V, MatchContext & MContext) { return match_isa<MatchContext, Class>(V); }
};		};

/// Match an arbitrary value and ignore it.		/// Match an arbitrary value and ignore it.
inline class_match<Value> m_Value() { return class_match<Value>(); }		inline class_match<Value> m_Value() { return class_match<Value>(); }

/// Match an arbitrary binary operation and ignore it.		/// Match an arbitrary binary operation and ignore it.
inline class_match<BinaryOperator> m_BinOp() {		inline class_match<BinaryOperator> m_BinOp() {
return class_match<BinaryOperator>();		return class_match<BinaryOperator>();
[... 34 unchanged lines not shown ...]

/// Matching combinators		/// Matching combinators
template <typename LTy, typename RTy> struct match_combine_or {		template <typename LTy, typename RTy> struct match_combine_or {
LTy L;		LTy L;
RTy R;		RTy R;

match_combine_or(const LTy &Left, const RTy &Right) : L(Left), R(Right) {}		match_combine_or(const LTy &Left, const RTy &Right) : L(Left), R(Right) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (L.match(V))		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
		MatchContext SubContext;

		if (L.match_context(V, SubContext) && MContext.acceptContext(SubContext)) {
		MContext.mergeContext(SubContext);
return true;		return true;
if (R.match(V))		}
		if (R.match_context(V, MContext)) {
return true;		return true;
		}
return false;		return false;
}		}
};		};
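
In the new `match_combine_or::match_context`, the left alternative is matched against a freshly constructed `SubContext` and merged into `MContext` only on success, so a failed left attempt leaves the caller's context untouched. The standalone sketch below illustrates that behaviour with integers instead of IR values; `CountingContext`, `AcceptThenCompare`, and `CombineOr` are hypothetical toy types, not part of the patch.

```cpp
#include <cassert>

// Toy context that counts how many nodes were accepted into it.
struct CountingContext {
  int Accepted = 0;
  bool acceptContext(const CountingContext &) const { return true; }
  bool mergeContext(const CountingContext &E) {
    Accepted += E.Accepted;
    return true;
  }
};

// Records one acceptance in the context, then succeeds only if V == Expected.
struct AcceptThenCompare {
  int Expected;
  template <typename Ctx> bool match_context(int V, Ctx &C) {
    ++C.Accepted;
    return V == Expected;
  }
};

// Same control flow as match_combine_or::match_context in the patch.
template <typename LTy, typename RTy> struct CombineOr {
  LTy L;
  RTy R;
  template <typename Ctx> bool match_context(int V, Ctx &MContext) {
    Ctx SubContext; // scratch context: a failed L must not alter MContext
    if (L.match_context(V, SubContext) && MContext.acceptContext(SubContext)) {
      MContext.mergeContext(SubContext);
      return true;
    }
    return R.match_context(V, MContext);
  }
};
```

When the left pattern fails and the right one succeeds, the outer context ends up with a single acceptance (the right pattern's), because the left pattern's side effects were confined to the discarded scratch context.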

template <typename LTy, typename RTy> struct match_combine_and {		template <typename LTy, typename RTy> struct match_combine_and {
LTy L;		LTy L;
RTy R;		RTy R;

match_combine_and(const LTy &Left, const RTy &Right) : L(Left), R(Right) {}		match_combine_and(const LTy &Left, const RTy &Right) : L(Left), R(Right) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (L.match(V))		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (R.match(V))		if (L.match_context(V, MContext))
		if (R.match_context(V, MContext))
return true;		return true;
return false;		return false;
}		}
};		};

/// Combine two pattern matchers matching L || R		/// Combine two pattern matchers matching L || R
template <typename LTy, typename RTy>		template <typename LTy, typename RTy>
inline match_combine_or<LTy, RTy> m_CombineOr(const LTy &L, const RTy &R) {		inline match_combine_or<LTy, RTy> m_CombineOr(const LTy &L, const RTy &R) {
return match_combine_or<LTy, RTy>(L, R);		return match_combine_or<LTy, RTy>(L, R);
}		}

/// Combine two pattern matchers matching L && R		/// Combine two pattern matchers matching L && R
template <typename LTy, typename RTy>		template <typename LTy, typename RTy>
inline match_combine_and<LTy, RTy> m_CombineAnd(const LTy &L, const RTy &R) {		inline match_combine_and<LTy, RTy> m_CombineAnd(const LTy &L, const RTy &R) {
return match_combine_and<LTy, RTy>(L, R);		return match_combine_and<LTy, RTy>(L, R);
}		}

struct apint_match {		struct apint_match {
const APInt *&Res;		const APInt *&Res;
bool AllowUndef;		bool AllowUndef;

apint_match(const APInt *&Res, bool AllowUndef)		apint_match(const APInt *&Res, bool AllowUndef)
: Res(Res), AllowUndef(AllowUndef) {}		: Res(Res), AllowUndef(AllowUndef) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (auto *CI = dyn_cast<ConstantInt>(V)) {		if (auto *CI = dyn_cast<ConstantInt>(V)) {
Res = &CI->getValue();		Res = &CI->getValue();
return true;		return true;
}		}
if (V->getType()->isVectorTy())		if (V->getType()->isVectorTy())
if (const auto *C = dyn_cast<Constant>(V))		if (const auto *C = dyn_cast<Constant>(V))
if (auto *CI = dyn_cast_or_null<ConstantInt>(		if (auto *CI = dyn_cast_or_null<ConstantInt>(
C->getSplatValue(AllowUndef))) {		C->getSplatValue(AllowUndef))) {
Res = &CI->getValue();		Res = &CI->getValue();
return true;		return true;
}		}
return false;		return false;
}		}
};		};
// Either constexpr if or renaming ConstantFP::getValueAPF to		// Either constexpr if or renaming ConstantFP::getValueAPF to
// ConstantFP::getValue is needed to do it via single template		// ConstantFP::getValue is needed to do it via single template
// function for both apint/apfloat.		// function for both apint/apfloat.
struct apfloat_match {		struct apfloat_match {
const APFloat *&Res;		const APFloat *&Res;
bool AllowUndef;		bool AllowUndef;

apfloat_match(const APFloat *&Res, bool AllowUndef)		apfloat_match(const APFloat *&Res, bool AllowUndef)
: Res(Res), AllowUndef(AllowUndef) {}		: Res(Res), AllowUndef(AllowUndef) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (auto *CI = dyn_cast<ConstantFP>(V)) {		if (auto *CI = dyn_cast<ConstantFP>(V)) {
Res = &CI->getValueAPF();		Res = &CI->getValueAPF();
return true;		return true;
}		}
if (V->getType()->isVectorTy())		if (V->getType()->isVectorTy())
if (const auto *C = dyn_cast<Constant>(V))		if (const auto *C = dyn_cast<Constant>(V))
if (auto *CI = dyn_cast_or_null<ConstantFP>(		if (auto *CI = dyn_cast_or_null<ConstantFP>(
C->getSplatValue(AllowUndef))) {		C->getSplatValue(AllowUndef))) {
[... 34 unchanged lines not shown ...]
}		}

/// Match APFloat while forbidding undefs in splat vector constants.		/// Match APFloat while forbidding undefs in splat vector constants.
inline apfloat_match m_APFloatForbidUndef(const APFloat *&Res) {		inline apfloat_match m_APFloatForbidUndef(const APFloat *&Res) {
return apfloat_match(Res, /* AllowUndef */ false);		return apfloat_match(Res, /* AllowUndef */ false);
}		}

template <int64_t Val> struct constantint_match {		template <int64_t Val> struct constantint_match {
template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (const auto *CI = dyn_cast<ConstantInt>(V)) {		if (const auto *CI = dyn_cast<ConstantInt>(V)) {
const APInt &CIV = CI->getValue();		const APInt &CIV = CI->getValue();
if (Val >= 0)		if (Val >= 0)
return CIV == static_cast<uint64_t>(Val);		return CIV == static_cast<uint64_t>(Val);
// If Val is negative, and CI is shorter than it, truncate to the right		// If Val is negative, and CI is shorter than it, truncate to the right
// number of bits. If it is larger, then we have to sign extend. Just		// number of bits. If it is larger, then we have to sign extend. Just
// compare their negated values.		// compare their negated values.
return -CIV == -Val;		return -CIV == -Val;
}		}
return false;		return false;
}		}
};		};

/// Match a ConstantInt with a specific value.		/// Match a ConstantInt with a specific value.
template <int64_t Val> inline constantint_match<Val> m_ConstantInt() {		template <int64_t Val> inline constantint_match<Val> m_ConstantInt() {
return constantint_match<Val>();		return constantint_match<Val>();
}		}

/// This helper class is used to match scalar and vector integer constants that		/// This helper class is used to match scalar and vector integer constants that
/// satisfy a specified predicate.		/// satisfy a specified predicate.
/// For vector constants, undefined elements are ignored.		/// For vector constants, undefined elements are ignored.
template <typename Predicate> struct cst_pred_ty : public Predicate {		template <typename Predicate> struct cst_pred_ty : public Predicate {
template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (const auto *CI = dyn_cast<ConstantInt>(V))		if (const auto *CI = dyn_cast<ConstantInt>(V))
return this->isValue(CI->getValue());		return this->isValue(CI->getValue());
if (V->getType()->isVectorTy()) {		if (V->getType()->isVectorTy()) {
if (const auto *C = dyn_cast<Constant>(V)) {		if (const auto *C = dyn_cast<Constant>(V)) {
if (const auto *CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue()))		if (const auto *CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue()))
return this->isValue(CI->getValue());		return this->isValue(CI->getValue());

// Non-splat vector constant: check each element for a match.		// Non-splat vector constant: check each element for a match.
[... 20 unchanged lines not shown ...]

/// This helper class is used to match scalar and vector constants that		/// This helper class is used to match scalar and vector constants that
/// satisfy a specified predicate, and bind them to an APInt.		/// satisfy a specified predicate, and bind them to an APInt.
template <typename Predicate> struct api_pred_ty : public Predicate {		template <typename Predicate> struct api_pred_ty : public Predicate {
const APInt *&Res;		const APInt *&Res;

api_pred_ty(const APInt *&R) : Res(R) {}		api_pred_ty(const APInt *&R) : Res(R) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (const auto *CI = dyn_cast<ConstantInt>(V))		if (const auto *CI = dyn_cast<ConstantInt>(V))
if (this->isValue(CI->getValue())) {		if (this->isValue(CI->getValue())) {
Res = &CI->getValue();		Res = &CI->getValue();
return true;		return true;
}		}
if (V->getType()->isVectorTy())		if (V->getType()->isVectorTy())
if (const auto *C = dyn_cast<Constant>(V))		if (const auto *C = dyn_cast<Constant>(V))
if (auto *CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue()))		if (auto *CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue()))
if (this->isValue(CI->getValue())) {		if (this->isValue(CI->getValue())) {
Res = &CI->getValue();		Res = &CI->getValue();
return true;		return true;
}		}

return false;		return false;
}		}
};		};

/// This helper class is used to match scalar and vector floating-point		/// This helper class is used to match scalar and vector floating-point
/// constants that satisfy a specified predicate.		/// constants that satisfy a specified predicate.
/// For vector constants, undefined elements are ignored.		/// For vector constants, undefined elements are ignored.
template <typename Predicate> struct cstfp_pred_ty : public Predicate {		template <typename Predicate> struct cstfp_pred_ty : public Predicate {
template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (const auto *CF = dyn_cast<ConstantFP>(V))		if (const auto *CF = dyn_cast<ConstantFP>(V))
return this->isValue(CF->getValueAPF());		return this->isValue(CF->getValueAPF());
if (V->getType()->isVectorTy()) {		if (V->getType()->isVectorTy()) {
if (const auto *C = dyn_cast<Constant>(V)) {		if (const auto *C = dyn_cast<Constant>(V)) {
if (const auto *CF = dyn_cast_or_null<ConstantFP>(C->getSplatValue()))		if (const auto *CF = dyn_cast_or_null<ConstantFP>(C->getSplatValue()))
return this->isValue(CF->getValueAPF());		return this->isValue(CF->getValueAPF());

// Non-splat vector constant: check each element for a match.		// Non-splat vector constant: check each element for a match.
[... 118 unchanged lines not shown ...]
};		};
/// Match an integer 0 or a vector with all elements equal to 0.		/// Match an integer 0 or a vector with all elements equal to 0.
/// For vectors, this includes constants with undefined elements.		/// For vectors, this includes constants with undefined elements.
inline cst_pred_ty<is_zero_int> m_ZeroInt() {		inline cst_pred_ty<is_zero_int> m_ZeroInt() {
return cst_pred_ty<is_zero_int>();		return cst_pred_ty<is_zero_int>();
}		}

struct is_zero {		struct is_zero {
template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
auto *C = dyn_cast<Constant>(V);		auto *C = dyn_cast<Constant>(V);
return C && (C->isNullValue() || cst_pred_ty<is_zero_int>().match(C));		return C && (C->isNullValue() || cst_pred_ty<is_zero_int>().match(C));
}		}
};		};
/// Match any null constant or a vector with all elements equal to 0.		/// Match any null constant or a vector with all elements equal to 0.
/// For vectors, this includes constants with undefined elements.		/// For vectors, this includes constants with undefined elements.
inline is_zero m_Zero() {		inline is_zero m_Zero() {
return is_zero();		return is_zero();
[... 131 unchanged lines not shown ...]

///////////////////////////////////////////////////////////////////////////////		///////////////////////////////////////////////////////////////////////////////

template <typename Class> struct bind_ty {		template <typename Class> struct bind_ty {
Class *&VR;		Class *&VR;

bind_ty(Class *&V) : VR(V) {}		bind_ty(Class *&V) : VR(V) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (auto *CV = dyn_cast<Class>(V)) {		if (auto *CV = dyn_cast<Class>(V)) {
		if (!MContext.acceptBoundNode(V)) return false;

VR = CV;		VR = CV;
return true;		return true;
}		}
return false;		return false;
}		}
};		};
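As a rough illustration of the `acceptBoundNode` hook added to `bind_ty` above, the sketch below shows a binder that captures a node only when the context accepts it. `ToyNode`, `bind_toy` and `NonNegativeContext` are hypothetical names invented for this example; the veto rule (reject negative values) is an arbitrary assumption chosen to make the effect visible.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Value; not part of LLVM.
struct ToyNode {
  int Value;
};

// Hypothetical context that vetoes capturing nodes with a negative value.
struct NonNegativeContext {
  bool acceptBoundNode(const ToyNode *N) const { return N->Value >= 0; }
};

// Binder in the style of the patched bind_ty: the context is consulted
// before the captured pointer is written.
struct bind_toy {
  ToyNode *&VR;

  explicit bind_toy(ToyNode *&V) : VR(V) {}

  template <typename MatchContext>
  bool match_context(ToyNode *V, MatchContext &MContext) {
    if (!V)
      return false;
    if (!MContext.acceptBoundNode(V))
      return false; // Rejected: VR is left untouched.
    VR = V;
    return true;
  }
};
```

Because the check happens before the assignment, a rejected node never overwrites the caller's variable.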

/// Match a value, capturing it if we match.		/// Match a value, capturing it if we match.
// ... (23 unchanged lines not shown) ...
}		}

/// Match a specified Value*.		/// Match a specified Value*.
struct specificval_ty {		struct specificval_ty {
const Value *Val;		const Value *Val;

specificval_ty(const Value *V) : Val(V) {}		specificval_ty(const Value *V) : Val(V) {}

template <typename ITy> bool match(ITy *V) { return V == Val; }		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) { return V == Val; }
};		};

/// Match if we have a specific specified value.		/// Match if we have a specific specified value.
inline specificval_ty m_Specific(const Value *V) { return V; }		inline specificval_ty m_Specific(const Value *V) { return V; }

/// Stores a reference to the Value *, not the Value * itself,		/// Stores a reference to the Value *, not the Value * itself,
/// thus can be used in commutative matchers.		/// thus can be used in commutative matchers.
template <typename Class> struct deferredval_ty {		template <typename Class> struct deferredval_ty {
Class *const &Val;		Class *const &Val;

deferredval_ty(Class *const &V) : Val(V) {}		deferredval_ty(Class *const &V) : Val(V) {}

template <typename ITy> bool match(ITy *const V) { return V == Val; }		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *const V, MatchContext & MContext) { return V == Val; }
};		};

/// A commutative-friendly version of m_Specific().		/// A commutative-friendly version of m_Specific().
inline deferredval_ty<Value> m_Deferred(Value *const &V) { return V; }		inline deferredval_ty<Value> m_Deferred(Value *const &V) { return V; }
inline deferredval_ty<const Value> m_Deferred(const Value *const &V) {		inline deferredval_ty<const Value> m_Deferred(const Value *const &V) {
return V;		return V;
}		}

/// Match a specified floating point value or vector of all elements of		/// Match a specified floating point value or vector of all elements of
/// that value.		/// that value.
struct specific_fpval {		struct specific_fpval {
double Val;		double Val;

specific_fpval(double V) : Val(V) {}		specific_fpval(double V) : Val(V) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (const auto *CFP = dyn_cast<ConstantFP>(V))		if (const auto *CFP = dyn_cast<ConstantFP>(V))
return CFP->isExactlyValue(Val);		return CFP->isExactlyValue(Val);
if (V->getType()->isVectorTy())		if (V->getType()->isVectorTy())
if (const auto *C = dyn_cast<Constant>(V))		if (const auto *C = dyn_cast<Constant>(V))
if (auto *CFP = dyn_cast_or_null<ConstantFP>(C->getSplatValue()))		if (auto *CFP = dyn_cast_or_null<ConstantFP>(C->getSplatValue()))
return CFP->isExactlyValue(Val);		return CFP->isExactlyValue(Val);
return false;		return false;
}		}
};		};

/// Match a specific floating point value or vector with all elements		/// Match a specific floating point value or vector with all elements
/// equal to the value.		/// equal to the value.
inline specific_fpval m_SpecificFP(double V) { return specific_fpval(V); }		inline specific_fpval m_SpecificFP(double V) { return specific_fpval(V); }

/// Match a float 1.0 or vector with all elements equal to 1.0.		/// Match a float 1.0 or vector with all elements equal to 1.0.
inline specific_fpval m_FPOne() { return m_SpecificFP(1.0); }		inline specific_fpval m_FPOne() { return m_SpecificFP(1.0); }

struct bind_const_intval_ty {		struct bind_const_intval_ty {
uint64_t &VR;		uint64_t &VR;

bind_const_intval_ty(uint64_t &V) : VR(V) {}		bind_const_intval_ty(uint64_t &V) : VR(V) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (const auto *CV = dyn_cast<ConstantInt>(V))		if (const auto *CV = dyn_cast<ConstantInt>(V))
if (CV->getValue().ule(UINT64_MAX)) {		if (CV->getValue().ule(UINT64_MAX)) {
VR = CV->getZExtValue();		VR = CV->getZExtValue();
return true;		return true;
}		}
return false;		return false;
}		}
};		};

/// Match a specified integer value or vector of all elements of that		/// Match a specified integer value or vector of all elements of that
/// value.		/// value.
struct specific_intval {		struct specific_intval {
APInt Val;		APInt Val;

specific_intval(APInt V) : Val(std::move(V)) {}		specific_intval(APInt V) : Val(std::move(V)) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
const auto *CI = dyn_cast<ConstantInt>(V);		const auto *CI = dyn_cast<ConstantInt>(V);
if (!CI && V->getType()->isVectorTy())		if (!CI && V->getType()->isVectorTy())
if (const auto *C = dyn_cast<Constant>(V))		if (const auto *C = dyn_cast<Constant>(V))
CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue());		CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue());

return CI && APInt::isSameValue(CI->getValue(), Val);		return CI && APInt::isSameValue(CI->getValue(), Val);
}		}
};		};
// ... (13 unchanged lines not shown) ...
inline bind_const_intval_ty m_ConstantInt(uint64_t &V) { return V; }		inline bind_const_intval_ty m_ConstantInt(uint64_t &V) { return V; }

/// Match a specified basic block value.		/// Match a specified basic block value.
struct specific_bbval {		struct specific_bbval {
BasicBlock *Val;		BasicBlock *Val;

specific_bbval(BasicBlock *Val) : Val(Val) {}		specific_bbval(BasicBlock *Val) : Val(Val) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EC; return match_context(V, EC); }
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
const auto *BB = dyn_cast<BasicBlock>(V);		const auto *BB = dyn_cast<BasicBlock>(V);
return BB && BB == Val;		return BB && BB == Val;
}		}
};		};

/// Match a specific basic block value.		/// Match a specific basic block value.
inline specific_bbval m_SpecificBB(BasicBlock *BB) {		inline specific_bbval m_SpecificBB(BasicBlock *BB) {
return specific_bbval(BB);		return specific_bbval(BB);
// ... (15 unchanged lines not shown) ...
struct AnyBinaryOp_match {		struct AnyBinaryOp_match {
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;

// The evaluation order is always stable, regardless of Commutability.		// The evaluation order is always stable, regardless of Commutability.
// The LHS is always matched first.		// The LHS is always matched first.
AnyBinaryOp_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}		AnyBinaryOp_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (auto *I = dyn_cast<BinaryOperator>(V))		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
return (L.match(I->getOperand(0)) && R.match(I->getOperand(1))) ||		auto * I = match_dyn_cast<MatchContext, BinaryOperator>(V);
(Commutable && L.match(I->getOperand(1)) &&		if (!I) return false;
R.match(I->getOperand(0)));
		if (!MContext.acceptInnerNode(I)) return false;

		MatchContext LRContext(MContext);
		if (L.match_context(I->getOperand(0), LRContext) && R.match_context(I->getOperand(1), LRContext) && MContext.mergeContext(LRContext)) return true;
		if (Commutable && (L.match_context(I->getOperand(1), MContext) && R.match_context(I->getOperand(0), MContext))) return true;
return false;		return false;
}		}
};		};

template <typename LHS, typename RHS>		template <typename LHS, typename RHS>
inline AnyBinaryOp_match<LHS, RHS> m_BinOp(const LHS &L, const RHS &R) {		inline AnyBinaryOp_match<LHS, RHS> m_BinOp(const LHS &L, const RHS &R) {
return AnyBinaryOp_match<LHS, RHS>(L, R);		return AnyBinaryOp_match<LHS, RHS>(L, R);
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Matchers for specific binary operators.		// Matchers for specific binary operators.
//		//

template <typename LHS_t, typename RHS_t, unsigned Opcode,		template <typename LHS_t, typename RHS_t, unsigned Opcode,
bool Commutable = false>		bool Commutable = false>
struct BinaryOp_match {		struct BinaryOp_match {
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;

// The evaluation order is always stable, regardless of Commutability.		// The evaluation order is always stable, regardless of Commutability.
// The LHS is always matched first.		// The LHS is always matched first.
BinaryOp_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}		BinaryOp_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (V->getValueID() == Value::InstructionVal + Opcode) {		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
auto *I = cast<BinaryOperator>(V);		auto * I = match_dyn_cast<MatchContext, const BinaryOperator>(V);
return (L.match(I->getOperand(0)) && R.match(I->getOperand(1))) ||		if (I && I->getOpcode() == Opcode) {
(Commutable && L.match(I->getOperand(1)) &&		MatchContext LRContext(MContext);
R.match(I->getOperand(0)));		if (!MContext.acceptInnerNode(I)) return false;
		if (L.match_context(I->getOperand(0), LRContext) && R.match_context(I->getOperand(1), LRContext) && MContext.mergeContext(LRContext)) return true;
		if (Commutable && (L.match_context(I->getOperand(1), MContext) && R.match_context(I->getOperand(0), MContext))) return true;
		return false;
}		}
if (auto *CE = dyn_cast<ConstantExpr>(V))		if (auto *CE = dyn_cast<ConstantExpr>(V))
return CE->getOpcode() == Opcode &&		return CE->getOpcode() == Opcode &&
((L.match(CE->getOperand(0)) && R.match(CE->getOperand(1))) ||		((L.match(CE->getOperand(0)) && R.match(CE->getOperand(1))) ||
(Commutable && L.match(CE->getOperand(1)) &&		(Commutable && L.match(CE->getOperand(1)) &&
R.match(CE->getOperand(0))));		R.match(CE->getOperand(0))));
return false;		return false;
}		}
// ... (22 unchanged lines not shown) ...
inline BinaryOp_match<LHS, RHS, Instruction::FSub> m_FSub(const LHS &L,		inline BinaryOp_match<LHS, RHS, Instruction::FSub> m_FSub(const LHS &L,
const RHS &R) {		const RHS &R) {
return BinaryOp_match<LHS, RHS, Instruction::FSub>(L, R);		return BinaryOp_match<LHS, RHS, Instruction::FSub>(L, R);
}		}

template <typename Op_t> struct FNeg_match {		template <typename Op_t> struct FNeg_match {
Op_t X;		Op_t X;

FNeg_match(const Op_t &Op) : X(Op) {}		FNeg_match(const Op_t &Op) : X(Op) {}
template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
auto *FPMO = dyn_cast<FPMathOperator>(V);		auto *FPMO = dyn_cast<FPMathOperator>(V);
if (!FPMO) return false;		if (!FPMO) return false;

if (FPMO->getOpcode() == Instruction::FNeg)		if (match_cast<MatchContext, const Operator>(V)->getOpcode() == Instruction::FNeg)
return X.match(FPMO->getOperand(0));		return X.match(FPMO->getOperand(0));

if (FPMO->getOpcode() == Instruction::FSub) {		if (match_cast<MatchContext, const Operator>(V)->getOpcode() == Instruction::FSub) {
if (FPMO->hasNoSignedZeros()) {		if (FPMO->hasNoSignedZeros()) {
// With 'nsz', any zero goes.		// With 'nsz', any zero goes.
if (!cstfp_pred_ty<is_any_zero_fp>().match(FPMO->getOperand(0)))		if (!cstfp_pred_ty<is_any_zero_fp>().match_context(FPMO->getOperand(0), MContext))
return false;		return false;
} else {		} else {
// Without 'nsz', we need fsub -0.0, X exactly.		// Without 'nsz', we need fsub -0.0, X exactly.
if (!cstfp_pred_ty<is_neg_zero_fp>().match(FPMO->getOperand(0)))		if (!cstfp_pred_ty<is_neg_zero_fp>().match_context(FPMO->getOperand(0), MContext))
return false;		return false;
}		}

return X.match(FPMO->getOperand(1));		return X.match_context(FPMO->getOperand(1), MContext);
}		}

return false;		return false;
}		}
};		};

/// Match 'fneg X' as 'fsub -0.0, X'.		/// Match 'fneg X' as 'fsub -0.0, X'.
template <typename OpTy>		template <typename OpTy>
// ... (97 unchanged lines not shown) ...
template <typename LHS_t, typename RHS_t, unsigned Opcode,		template <typename LHS_t, typename RHS_t, unsigned Opcode,
unsigned WrapFlags = 0>		unsigned WrapFlags = 0>
struct OverflowingBinaryOp_match {		struct OverflowingBinaryOp_match {
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;

OverflowingBinaryOp_match(const LHS_t &LHS, const RHS_t &RHS)		OverflowingBinaryOp_match(const LHS_t &LHS, const RHS_t &RHS)
: L(LHS), R(RHS) {}		: L(LHS), R(RHS) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
if (auto *Op = dyn_cast<OverflowingBinaryOperator>(V)) {		if (auto *Op = dyn_cast<OverflowingBinaryOperator>(V)) {
if (Op->getOpcode() != Opcode)		if (Op->getOpcode() != Opcode)
return false;		return false;
if (WrapFlags & OverflowingBinaryOperator::NoUnsignedWrap &&		if (WrapFlags & OverflowingBinaryOperator::NoUnsignedWrap &&
!Op->hasNoUnsignedWrap())		!Op->hasNoUnsignedWrap())
return false;		return false;
if (WrapFlags & OverflowingBinaryOperator::NoSignedWrap &&		if (WrapFlags & OverflowingBinaryOperator::NoSignedWrap &&
!Op->hasNoSignedWrap())		!Op->hasNoSignedWrap())
return false;		return false;
return L.match(Op->getOperand(0)) && R.match(Op->getOperand(1));		return L.match_context(Op->getOperand(0), MContext) && R.match_context(Op->getOperand(1), MContext);
}		}
return false;		return false;
}		}
};		};

template <typename LHS, typename RHS>		template <typename LHS, typename RHS>
inline OverflowingBinaryOp_match<LHS, RHS, Instruction::Add,		inline OverflowingBinaryOp_match<LHS, RHS, Instruction::Add,
OverflowingBinaryOperator::NoSignedWrap>		OverflowingBinaryOperator::NoSignedWrap>
// ... (65 unchanged lines not shown) ...
//		//
template <typename LHS_t, typename RHS_t, typename Predicate>		template <typename LHS_t, typename RHS_t, typename Predicate>
struct BinOpPred_match : Predicate {		struct BinOpPred_match : Predicate {
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;

BinOpPred_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}		BinOpPred_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (auto *I = dyn_cast<Instruction>(V))		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
return this->isOpType(I->getOpcode()) && L.match(I->getOperand(0)) &&		if (auto *I = match_dyn_cast<MatchContext, Instruction>(V))
R.match(I->getOperand(1));		return this->isOpType(I->getOpcode()) && L.match_context(I->getOperand(0), MContext) &&
		R.match_context(I->getOperand(1), MContext);
if (auto *CE = dyn_cast<ConstantExpr>(V))		if (auto *CE = dyn_cast<ConstantExpr>(V))
return this->isOpType(CE->getOpcode()) && L.match(CE->getOperand(0)) &&		return this->isOpType(CE->getOpcode()) && L.match(CE->getOperand(0)) &&
R.match(CE->getOperand(1));		R.match(CE->getOperand(1));
return false;		return false;
}		}
};		};

struct is_shift_op {		struct is_shift_op {
// ... (75 unchanged lines not shown) ...
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Class that matches exact binary ops.		// Class that matches exact binary ops.
//		//
template <typename SubPattern_t> struct Exact_match {		template <typename SubPattern_t> struct Exact_match {
SubPattern_t SubPattern;		SubPattern_t SubPattern;

Exact_match(const SubPattern_t &SP) : SubPattern(SP) {}		Exact_match(const SubPattern_t &SP) : SubPattern(SP) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
if (auto *PEO = dyn_cast<PossiblyExactOperator>(V))		if (auto *PEO = dyn_cast<PossiblyExactOperator>(V))
return PEO->isExact() && SubPattern.match(V);		return PEO->isExact() && SubPattern.match_context(V, MContext);
return false;		return false;
}		}
};		};

template <typename T> inline Exact_match<T> m_Exact(const T &SubPattern) {		template <typename T> inline Exact_match<T> m_Exact(const T &SubPattern) {
return SubPattern;		return SubPattern;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Matchers for CmpInst classes		// Matchers for CmpInst classes
//		//

template <typename LHS_t, typename RHS_t, typename Class, typename PredicateTy,		template <typename LHS_t, typename RHS_t, typename Class, typename PredicateTy,
bool Commutable = false>		bool Commutable = false>
struct CmpClass_match {		struct CmpClass_match {
PredicateTy &Predicate;		PredicateTy &Predicate;
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;

// The evaluation order is always stable, regardless of Commutability.		// The evaluation order is always stable, regardless of Commutability.
// The LHS is always matched first.		// The LHS is always matched first.
CmpClass_match(PredicateTy &Pred, const LHS_t &LHS, const RHS_t &RHS)		CmpClass_match(PredicateTy &Pred, const LHS_t &LHS, const RHS_t &RHS)
: Predicate(Pred), L(LHS), R(RHS) {}		: Predicate(Pred), L(LHS), R(RHS) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (auto *I = dyn_cast<Class>(V)) {		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
if (L.match(I->getOperand(0)) && R.match(I->getOperand(1))) {		if (auto *I = match_dyn_cast<MatchContext, Class>(V)) {
		if (!MContext.acceptInnerNode(I))
		return false;

		MatchContext LRContext(MContext);
		if (L.match_context(I->getOperand(0), LRContext) &&
		R.match_context(I->getOperand(1), LRContext) &&
		MContext.mergeContext(LRContext)) {
Predicate = I->getPredicate();		Predicate = I->getPredicate();
return true;		return true;
} else if (Commutable && L.match(I->getOperand(1)) &&		}
R.match(I->getOperand(0))) {
		if (!Commutable)
		return false;

		MatchContext RLContext(MContext);
		if (L.match_context(I->getOperand(1), RLContext) &&
		R.match_context(I->getOperand(0), RLContext) &&
		MContext.mergeContext(RLContext)) {
Predicate = I->getSwappedPredicate();		Predicate = I->getSwappedPredicate();
return true;		return true;
}		}
}		}
return false;		return false;
}		}
};		};
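
The CmpClass_match change above follows a pattern that recurs throughout this diff: `match` delegates to `match_context` with an `EmptyContext`, and each operand order is tried on a scratch copy of the context that is merged back only on success. The following is a minimal, self-contained sketch of that pattern, not the patch's code. `Node`, `CountingContext`, `Same_match`, `Unary_match`, `Binary_match` and `countAcceptedInCommutedMatch` are hypothetical toy names, and the toy `EmptyContext` only imitates the two calls (`acceptInnerNode`, `mergeContext`) visible in the diff.

```cpp
#include <cassert>
#include <vector>

// Toy IR node, a stand-in for llvm::Value (hypothetical, not an LLVM type).
struct Node {
  int Opcode; // 0 marks a leaf
  std::vector<const Node *> Ops;
};

// Imitates the role EmptyContext plays in the diff: accept everything.
struct EmptyContext {
  bool acceptInnerNode(const Node *) { return true; }
  bool mergeContext(const EmptyContext &) { return true; }
};

// Hypothetical context that counts the inner nodes of a successful match.
struct CountingContext {
  int Accepted = 0;
  bool acceptInnerNode(const Node *) {
    ++Accepted;
    return true;
  }
  bool mergeContext(const CountingContext &Other) {
    Accepted = Other.Accepted;
    return true;
  }
};

// Matches one specific node by identity.
struct Same_match {
  const Node *Want;
  template <typename Ctx> bool match_context(const Node *N, Ctx &) {
    return N == Want;
  }
};

// Matches a one-operand node with the given opcode.
template <typename Op_t> struct Unary_match {
  int Opcode;
  Op_t Op;
  template <typename Ctx> bool match_context(const Node *N, Ctx &C) {
    return N->Opcode == Opcode && N->Ops.size() == 1 &&
           C.acceptInnerNode(N) && Op.match_context(N->Ops[0], C);
  }
};

// Matches a two-operand node, optionally trying both operand orders.
template <typename LHS_t, typename RHS_t, bool Commutable = false>
struct Binary_match {
  int Opcode;
  LHS_t L;
  RHS_t R;

  // Context-free entry point: delegate with a context that accepts anything.
  bool match(const Node *N) {
    EmptyContext EC;
    return match_context(N, EC);
  }

  template <typename Ctx> bool match_context(const Node *N, Ctx &C) {
    if (N->Opcode != Opcode || N->Ops.size() != 2 || !C.acceptInnerNode(N))
      return false;
    // Each operand order runs on a scratch copy, so a failed attempt leaves
    // no trace in C.
    Ctx LR(C);
    if (L.match_context(N->Ops[0], LR) && R.match_context(N->Ops[1], LR) &&
        C.mergeContext(LR))
      return true;
    if (!Commutable)
      return false;
    Ctx RL(C);
    return L.match_context(N->Ops[1], RL) && R.match_context(N->Ops[0], RL) &&
           C.mergeContext(RL);
  }
};

// Builds ADD(NEG(Q), NEG(X)) and matches it against the commutable pattern
// ADD(NEG(X), <the first operand>); returns the accepted-node count, or -1
// if the pattern does not match.
inline int countAcceptedInCommutedMatch() {
  const int NEG = 1, ADD = 2;
  Node Q{0, {}}, X{0, {}};
  Node A{NEG, {&Q}}, B{NEG, {&X}};
  Node Add{ADD, {&A, &B}};
  Binary_match<Unary_match<Same_match>, Same_match, true> P{
      ADD, {NEG, {&X}}, {&A}};
  CountingContext C;
  if (!P.match_context(&Add, C))
    return -1;
  return C.Accepted;
}
```

In this toy setup the first operand order accepts NEG(Q) and then fails on the leaf. Because that attempt ran on a scratch copy, the count merged back after the commuted attempt covers only the ADD node and NEG(X).
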

Show All 20 Lines
//		//

/// Matches instructions with Opcode and one operand.		/// Matches instructions with Opcode and one operand.
template <typename T0, unsigned Opcode> struct OneOps_match {		template <typename T0, unsigned Opcode> struct OneOps_match {
T0 Op1;		T0 Op1;

OneOps_match(const T0 &Op1) : Op1(Op1) {}		OneOps_match(const T0 &Op1) : Op1(Op1) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (V->getValueID() == Value::InstructionVal + Opcode) {		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
auto *I = cast<Instruction>(V);		auto *I = match_dyn_cast<MatchContext, Instruction>(V);
return Op1.match(I->getOperand(0));		if (I && I->getOpcode() == Opcode && MContext.acceptInnerNode(I)) {
		return Op1.match_context(I->getOperand(0), MContext);
}		}
return false;		return false;
}		}
};		};

/// Matches instructions with Opcode and two operands.		/// Matches instructions with Opcode and two operands.
template <typename T0, typename T1, unsigned Opcode> struct TwoOps_match {		template <typename T0, typename T1, unsigned Opcode> struct TwoOps_match {
T0 Op1;		T0 Op1;
T1 Op2;		T1 Op2;

TwoOps_match(const T0 &Op1, const T1 &Op2) : Op1(Op1), Op2(Op2) {}		TwoOps_match(const T0 &Op1, const T1 &Op2) : Op1(Op1), Op2(Op2) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (V->getValueID() == Value::InstructionVal + Opcode) {		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
auto *I = cast<Instruction>(V);		auto *I = match_dyn_cast<MatchContext, Instruction>(V);
return Op1.match(I->getOperand(0)) && Op2.match(I->getOperand(1));		if (I && I->getOpcode() == Opcode && MContext.acceptInnerNode(I)) {
		return Op1.match_context(I->getOperand(0), MContext) &&
		Op2.match_context(I->getOperand(1), MContext);
}		}
return false;		return false;
}		}
};		};

/// Matches instructions with Opcode and three operands.		/// Matches instructions with Opcode and three operands.
template <typename T0, typename T1, typename T2, unsigned Opcode>		template <typename T0, typename T1, typename T2, unsigned Opcode>
struct ThreeOps_match {		struct ThreeOps_match {
T0 Op1;		T0 Op1;
T1 Op2;		T1 Op2;
T2 Op3;		T2 Op3;

ThreeOps_match(const T0 &Op1, const T1 &Op2, const T2 &Op3)		ThreeOps_match(const T0 &Op1, const T1 &Op2, const T2 &Op3)
: Op1(Op1), Op2(Op2), Op3(Op3) {}		: Op1(Op1), Op2(Op2), Op3(Op3) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (V->getValueID() == Value::InstructionVal + Opcode) {		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
auto *I = cast<Instruction>(V);		auto *I = match_dyn_cast<MatchContext, Instruction>(V);
return Op1.match(I->getOperand(0)) && Op2.match(I->getOperand(1)) &&		if (I && I->getOpcode() == Opcode && MContext.acceptInnerNode(I)) {
Op3.match(I->getOperand(2));		return Op1.match_context(I->getOperand(0), MContext) &&
		Op2.match_context(I->getOperand(1), MContext) &&
		Op3.match_context(I->getOperand(2), MContext);
}		}
return false;		return false;
}		}
};		};

/// Matches SelectInst.		/// Matches SelectInst.
template <typename Cond, typename LHS, typename RHS>		template <typename Cond, typename LHS, typename RHS>
inline ThreeOps_match<Cond, LHS, RHS, Instruction::Select>		inline ThreeOps_match<Cond, LHS, RHS, Instruction::Select>
▲ Show 20 Lines • Show All 57 Lines • ▼ Show 20 Lines
// Matchers for CastInst classes		// Matchers for CastInst classes
//		//

template <typename Op_t, unsigned Opcode> struct CastClass_match {		template <typename Op_t, unsigned Opcode> struct CastClass_match {
Op_t Op;		Op_t Op;

CastClass_match(const Op_t &OpMatch) : Op(OpMatch) {}		CastClass_match(const Op_t &OpMatch) : Op(OpMatch) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (auto *O = dyn_cast<Operator>(V))		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
return O->getOpcode() == Opcode && Op.match(O->getOperand(0));		if (auto O = match_dyn_cast<MatchContext, Operator>(V))
		return O->getOpcode() == Opcode && MContext.acceptInnerNode(O) && Op.match_context(O->getOperand(0), MContext);
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - return O->getOpcode() == Opcode && MContext.acceptInnerNode(O) && Op.match_context(O->getOperand(0), MContext); + return O->getOpcode() == Opcode && MContext.acceptInnerNode(O) && + Op.match_context(O->getOperand(0), MContext); Lint: Pre-merge checks: clang-format: please reformat the code ``` - return O->getOpcode() == Opcode && MContext.
return false;		return false;
}		}
};		};

/// Matches BitCast.		/// Matches BitCast.
template <typename OpTy>		template <typename OpTy>
inline CastClass_match<OpTy, Instruction::BitCast> m_BitCast(const OpTy &Op) {		inline CastClass_match<OpTy, Instruction::BitCast> m_BitCast(const OpTy &Op) {
return CastClass_match<OpTy, Instruction::BitCast>(Op);		return CastClass_match<OpTy, Instruction::BitCast>(Op);
▲ Show 20 Lines • Show All 85 Lines • ▼ Show 20 Lines
// Matchers for control flow.		// Matchers for control flow.
//		//

struct br_match {		struct br_match {
BasicBlock *&Succ;		BasicBlock *&Succ;

br_match(BasicBlock *&Succ) : Succ(Succ) {}		br_match(BasicBlock *&Succ) : Succ(Succ) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (auto *BI = dyn_cast<BranchInst>(V))		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
		if (auto *BI = match_dyn_cast<MatchContext, BranchInst>(V))
if (BI->isUnconditional()) {		if (BI->isUnconditional()) {
Succ = BI->getSuccessor(0);		Succ = BI->getSuccessor(0);
return true;		return true;
}		}
return false;		return false;
}		}
};		};

inline br_match m_UnconditionalBr(BasicBlock *&Succ) { return br_match(Succ); }		inline br_match m_UnconditionalBr(BasicBlock *&Succ) { return br_match(Succ); }

template <typename Cond_t, typename TrueBlock_t, typename FalseBlock_t>		template <typename Cond_t, typename TrueBlock_t, typename FalseBlock_t>
struct brc_match {		struct brc_match {
Cond_t Cond;		Cond_t Cond;
TrueBlock_t T;		TrueBlock_t T;
FalseBlock_t F;		FalseBlock_t F;

brc_match(const Cond_t &C, const TrueBlock_t &t, const FalseBlock_t &f)		brc_match(const Cond_t &C, const TrueBlock_t &t, const FalseBlock_t &f)
: Cond(C), T(t), F(f) {}		: Cond(C), T(t), F(f) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (auto *BI = dyn_cast<BranchInst>(V))		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
if (BI->isConditional() && Cond.match(BI->getCondition()))		if (auto *BI = match_dyn_cast<MatchContext, BranchInst>(V))
return T.match(BI->getSuccessor(0)) && F.match(BI->getSuccessor(1));		if (BI->isConditional() && Cond.match(BI->getCondition())) {
		return T.match_context(BI->getSuccessor(0), MContext) && F.match_context(BI->getSuccessor(1), MContext);
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - return T.match_context(BI->getSuccessor(0), MContext) && F.match_context(BI->getSuccessor(1), MContext); + return T.match_context(BI->getSuccessor(0), MContext) && + F.match_context(BI->getSuccessor(1), MContext); Lint: Pre-merge checks: clang-format: please reformat the code ``` - return T.match_context(BI->getSuccessor(0)…
		}
return false;		return false;
}		}
};		};

template <typename Cond_t>		template <typename Cond_t>
inline brc_match<Cond_t, bind_ty<BasicBlock>, bind_ty<BasicBlock>>		inline brc_match<Cond_t, bind_ty<BasicBlock>, bind_ty<BasicBlock>>
m_Br(const Cond_t &C, BasicBlock *&T, BasicBlock *&F) {		m_Br(const Cond_t &C, BasicBlock *&T, BasicBlock *&F) {
return brc_match<Cond_t, bind_ty<BasicBlock>, bind_ty<BasicBlock>>(		return brc_match<Cond_t, bind_ty<BasicBlock>, bind_ty<BasicBlock>>(
Show All 15 Lines
struct MaxMin_match {		struct MaxMin_match {
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;

// The evaluation order is always stable, regardless of Commutability.		// The evaluation order is always stable, regardless of Commutability.
// The LHS is always matched first.		// The LHS is always matched first.
MaxMin_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}		MaxMin_match(const LHS_t &LHS, const RHS_t &RHS) : L(LHS), R(RHS) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
// Look for "(x pred y) ? x : y" or "(x pred y) ? y : x".		// Look for "(x pred y) ? x : y" or "(x pred y) ? y : x".
auto *SI = dyn_cast<SelectInst>(V);		auto *SI = match_dyn_cast<MatchContext, SelectInst>(V);
if (!SI)		if (!SI || !MContext.acceptInnerNode(SI))
return false;		return false;
auto *Cmp = dyn_cast<CmpInst_t>(SI->getCondition());		auto *Cmp = match_dyn_cast<MatchContext, CmpInst_t>(SI->getCondition());
if (!Cmp)		if (!Cmp || !MContext.acceptInnerNode(Cmp))
return false;		return false;
// At this point we have a select conditioned on a comparison. Check that		// At this point we have a select conditioned on a comparison. Check that
// it is the values returned by the select that are being compared.		// it is the values returned by the select that are being compared.
Value *TrueVal = SI->getTrueValue();		Value *TrueVal = SI->getTrueValue();
Value *FalseVal = SI->getFalseValue();		Value *FalseVal = SI->getFalseValue();
Value *LHS = Cmp->getOperand(0);		Value *LHS = Cmp->getOperand(0);
Value *RHS = Cmp->getOperand(1);		Value *RHS = Cmp->getOperand(1);
if ((TrueVal != LHS || FalseVal != RHS) &&		if ((TrueVal != LHS || FalseVal != RHS) &&
(TrueVal != RHS || FalseVal != LHS))		(TrueVal != RHS || FalseVal != LHS))
return false;		return false;
typename CmpInst_t::Predicate Pred =		typename CmpInst_t::Predicate Pred =
LHS == TrueVal ? Cmp->getPredicate() : Cmp->getInversePredicate();		LHS == TrueVal ? Cmp->getPredicate() : Cmp->getInversePredicate();
// Does "(x pred y) ? x : y" represent the desired max/min operation?		// Does "(x pred y) ? x : y" represent the desired max/min operation?
if (!Pred_t::match(Pred))		if (!Pred_t::match(Pred))
return false;		return false;

// It does! Bind the operands.		// It does! Bind the operands.
return (L.match(LHS) && R.match(RHS)) ||		MatchContext LRContext(MContext);
(Commutable && L.match(RHS) && R.match(LHS));		if (L.match_context(LHS, LRContext) && R.match_context(RHS, LRContext) && MContext.mergeContext(LRContext)) return true;
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - if (L.match_context(LHS, LRContext) && R.match_context(RHS, LRContext) && MContext.mergeContext(LRContext)) return true; - if (Commutable && (L.match_context(RHS, MContext) && R.match_context(LHS, MContext))) return true; + if (L.match_context(LHS, LRContext) && R.match_context(RHS, LRContext) && + MContext.mergeContext(LRContext)) + return true; + if (Commutable && + (L.match_context(RHS, MContext) && R.match_context(LHS, MContext))) + return true; Lint: Pre-merge checks: clang-format: please reformat the code ``` - if (L.match_context(LHS, LRContext) && R.
		if (Commutable && (L.match_context(RHS, MContext) && R.match_context(LHS, MContext))) return true;
		return false;
}		}
};		};

/// Helper class for identifying signed max predicates.		/// Helper class for identifying signed max predicates.
struct smax_pred_ty {		struct smax_pred_ty {
static bool match(ICmpInst::Predicate Pred) {		static bool match(ICmpInst::Predicate Pred) {
return Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE;		return Pred == CmpInst::ICMP_SGT || Pred == CmpInst::ICMP_SGE;
}		}
▲ Show 20 Lines • Show All 141 Lines • ▼ Show 20 Lines
struct UAddWithOverflow_match {		struct UAddWithOverflow_match {
LHS_t L;		LHS_t L;
RHS_t R;		RHS_t R;
Sum_t S;		Sum_t S;

UAddWithOverflow_match(const LHS_t &L, const RHS_t &R, const Sum_t &S)		UAddWithOverflow_match(const LHS_t &L, const RHS_t &R, const Sum_t &S)
: L(L), R(R), S(S) {}		: L(L), R(R), S(S) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
Value *ICmpLHS, *ICmpRHS;		Value *ICmpLHS, *ICmpRHS;
ICmpInst::Predicate Pred;		ICmpInst::Predicate Pred;
if (!m_ICmp(Pred, m_Value(ICmpLHS), m_Value(ICmpRHS)).match(V))		if (!m_ICmp(Pred, m_Value(ICmpLHS), m_Value(ICmpRHS)).match(V))
return false;		return false;

Value *AddLHS, *AddRHS;		Value *AddLHS, *AddRHS;
auto AddExpr = m_Add(m_Value(AddLHS), m_Value(AddRHS));		auto AddExpr = m_Add(m_Value(AddLHS), m_Value(AddRHS));

▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines
}		}

template <typename Opnd_t> struct Argument_match {		template <typename Opnd_t> struct Argument_match {
unsigned OpI;		unsigned OpI;
Opnd_t Val;		Opnd_t Val;

Argument_match(unsigned OpIdx, const Opnd_t &V) : OpI(OpIdx), Val(V) {}		Argument_match(unsigned OpIdx, const Opnd_t &V) : OpI(OpIdx), Val(V) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
// FIXME: Should likely be switched to use `CallBase`.		// FIXME: Should likely be switched to use `CallBase`.
if (const auto *CI = dyn_cast<CallInst>(V))		if (const auto *CI = match_dyn_cast<MatchContext, CallInst>(V))
return Val.match(CI->getArgOperand(OpI));		return Val.match(CI->getArgOperand(OpI));
return false;		return false;
}		}
};		};

/// Match an argument.		/// Match an argument.
template <unsigned OpI, typename Opnd_t>		template <unsigned OpI, typename Opnd_t>
inline Argument_match<Opnd_t> m_Argument(const Opnd_t &Op) {		inline Argument_match<Opnd_t> m_Argument(const Opnd_t &Op) {
return Argument_match<Opnd_t>(OpI, Op);		return Argument_match<Opnd_t>(OpI, Op);
}		}

/// Intrinsic matchers.		/// Intrinsic matchers.
struct IntrinsicID_match {		struct IntrinsicID_match {
unsigned ID;		unsigned ID;

IntrinsicID_match(Intrinsic::ID IntrID) : ID(IntrID) {}		IntrinsicID_match(Intrinsic::ID IntrID) : ID(IntrID) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
if (const auto *CI = dyn_cast<CallInst>(V))		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
		if (const auto *CI = match_dyn_cast<MatchContext, CallInst>(V))
if (const auto *F = CI->getCalledFunction())		if (const auto *F = CI->getCalledFunction())
return F->getIntrinsicID() == ID;		return F->getIntrinsicID() == ID;
return false;		return false;
}		}
};		};

/// Intrinsic matches are combinations of ID matchers, and argument		/// Intrinsic matches are combinations of ID matchers, and argument
/// matchers. Higher arity matchers are defined recursively in terms of and-ing		/// matchers. Higher arity matchers are defined recursively in terms of and-ing
▲ Show 20 Lines • Show All 208 Lines • ▼ Show 20 Lines
m_c_FMul(const LHS &L, const RHS &R) {		m_c_FMul(const LHS &L, const RHS &R) {
return BinaryOp_match<LHS, RHS, Instruction::FMul, true>(L, R);		return BinaryOp_match<LHS, RHS, Instruction::FMul, true>(L, R);
}		}

template <typename Opnd_t> struct Signum_match {		template <typename Opnd_t> struct Signum_match {
Opnd_t Val;		Opnd_t Val;
Signum_match(const Opnd_t &V) : Val(V) {}		Signum_match(const Opnd_t &V) : Val(V) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EContext; return match_context(V, EContext); }
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
unsigned TypeSize = V->getType()->getScalarSizeInBits();		unsigned TypeSize = V->getType()->getScalarSizeInBits();
if (TypeSize == 0)		if (TypeSize == 0)
return false;		return false;

unsigned ShiftWidth = TypeSize - 1;		unsigned ShiftWidth = TypeSize - 1;
Value *OpL = nullptr, *OpR = nullptr;		Value *OpL = nullptr, *OpR = nullptr;

// This is the representation of signum we match:		// This is the representation of signum we match:
Show All 23 Lines
template <typename Val_t> inline Signum_match<Val_t> m_Signum(const Val_t &V) {		template <typename Val_t> inline Signum_match<Val_t> m_Signum(const Val_t &V) {
return Signum_match<Val_t>(V);		return Signum_match<Val_t>(V);
}		}

template <int Ind, typename Opnd_t> struct ExtractValue_match {		template <int Ind, typename Opnd_t> struct ExtractValue_match {
Opnd_t Val;		Opnd_t Val;
ExtractValue_match(const Opnd_t &V) : Val(V) {}		ExtractValue_match(const Opnd_t &V) : Val(V) {}

template <typename OpTy> bool match(OpTy *V) {		template <typename OpTy> bool match(OpTy *V) { EmptyContext EC; return match_context(V, EC); }
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - template <typename OpTy> bool match(OpTy V) { EmptyContext EC; return match_context(V, EC); } - template <typename OpTy, typename MatchContext> bool match_context(OpTy V, MatchContext & MContext) { + template <typename OpTy> bool match(OpTy V) { + EmptyContext EC; + return match_context(V, EC); + } + template <typename OpTy, typename MatchContext> + bool match_context(OpTy V, MatchContext &MContext) { Lint: Pre-merge checks: clang-format: please reformat the code ``` - template <typename OpTy> bool match(OpTy *V) {…
		template <typename OpTy, typename MatchContext> bool match_context(OpTy *V, MatchContext & MContext) {
if (auto *I = dyn_cast<ExtractValueInst>(V))		if (auto *I = dyn_cast<ExtractValueInst>(V))
return I->getNumIndices() == 1 && I->getIndices()[0] == Ind &&		return I->getNumIndices() == 1 && I->getIndices()[0] == Ind &&
Val.match(I->getAggregateOperand());		Val.match_context(I->getAggregateOperand(), MContext);
return false;		return false;
}		}
};		};

/// Match a single index ExtractValue instruction.		/// Match a single index ExtractValue instruction.
/// For example m_ExtractValue<1>(...)		/// For example m_ExtractValue<1>(...)
template <int Ind, typename Val_t>		template <int Ind, typename Val_t>
inline ExtractValue_match<Ind, Val_t> m_ExtractValue(const Val_t &V) {		inline ExtractValue_match<Ind, Val_t> m_ExtractValue(const Val_t &V) {
Show All 11 Lines	private:
m_OffsetGep(const Base &B, const Offset &O) {		m_OffsetGep(const Base &B, const Offset &O) {
return BinaryOp_match<Base, Offset, Instruction::GetElementPtr>(B, O);		return BinaryOp_match<Base, Offset, Instruction::GetElementPtr>(B, O);
}		}

public:		public:
const DataLayout &DL;		const DataLayout &DL;
VScaleVal_match(const DataLayout &DL) : DL(DL) {}		VScaleVal_match(const DataLayout &DL) : DL(DL) {}

template <typename ITy> bool match(ITy *V) {		template <typename ITy> bool match(ITy *V) { EmptyContext EC; return match_context(V, EC); }
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - template <typename ITy> bool match(ITy V) { EmptyContext EC; return match_context(V, EC); } - template <typename ITy, typename MatchContext> bool match_context(ITy V, MatchContext & MContext) { + template <typename ITy> bool match(ITy V) { + EmptyContext EC; + return match_context(V, EC); + } + template <typename ITy, typename MatchContext> + bool match_context(ITy V, MatchContext &MContext) { Lint: Pre-merge checks: clang-format: please reformat the code ``` - template <typename ITy> bool match(ITy *V) {…
		template <typename ITy, typename MatchContext> bool match_context(ITy *V, MatchContext & MContext) {
if (m_Intrinsic<Intrinsic::vscale>().match(V))		if (m_Intrinsic<Intrinsic::vscale>().match(V))
return true;		return true;

if (m_PtrToInt(m_OffsetGep(m_Zero(), m_SpecificInt(1))).match(V)) {		if (m_PtrToInt(m_OffsetGep(m_Zero(), m_SpecificInt(1))).match(V)) {
Type *PtrTy = cast<Operator>(V)->getOperand(0)->getType();		Type *PtrTy = cast<Operator>(V)->getOperand(0)->getType();
Type *DerefTy = PtrTy->getPointerElementType();		Type *DerefTy = PtrTy->getPointerElementType();
if (DerefTy->isVectorTy() && DerefTy->getVectorIsScalable() &&		if (DerefTy->isVectorTy() && DerefTy->getVectorIsScalable() &&
DL.getTypeAllocSizeInBits(DerefTy).getKnownMinSize() == 8)		DL.getTypeAllocSizeInBits(DerefTy).getKnownMinSize() == 8)
Show All 15 Lines

llvm/include/llvm/IR/PredicatedInst.h

This file was added.

				//===-- llvm/PredicatedInst.h - Predication utility subclass --- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines various classes for working with predicated instructions.
				// Predicated instructions are either regular instructions or calls to
				// Vector Predication (VP) intrinsics that have a mask and an explicit
				// vector length argument.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_IR_PREDICATEDINST_H
				#define LLVM_IR_PREDICATEDINST_H

				#include "llvm/ADT/None.h"
				#include "llvm/ADT/Optional.h"
				#include "llvm/IR/Constants.h"
				#include "llvm/IR/Instruction.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/MatcherCast.h"
				#include "llvm/IR/Operator.h"
				#include "llvm/IR/Type.h"
				#include "llvm/IR/Value.h"
				#include "llvm/Support/Casting.h"

				#include <cstddef>

				namespace llvm {

				class BasicBlock;

				class PredicatedInstruction : public User {
				public:
				// The PredicatedInstruction class is intended to be used as a utility, and is
				// never itself instantiated.
				PredicatedInstruction() = delete;
				~PredicatedInstruction() = delete;

				void copyIRFlags(const Value *V, bool IncludeWrapFlags) {
				cast<Instruction>(this)->copyIRFlags(V, IncludeWrapFlags);
				}

				BasicBlock *getParent() { return cast<Instruction>(this)->getParent(); }
				const BasicBlock *getParent() const {
				return cast<const Instruction>(this)->getParent();
				}

				void *operator new(size_t s) = delete;

				Value *getMaskParam() const {
				auto thisVP = dyn_cast<VPIntrinsic>(this);
				if (!thisVP)
				return nullptr;
				return thisVP->getMaskParam();
				}

				Value *getVectorLengthParam() const {
				auto thisVP = dyn_cast<VPIntrinsic>(this);
				if (!thisVP)
				return nullptr;
				return thisVP->getVectorLengthParam();
				}

				/// \returns True if the passed vector length value has no predicating effect
				/// on the op.
				bool canIgnoreVectorLengthParam() const;

				/// \return True if the static operator of this instruction has a mask or
				/// vector length parameter.
				bool isVectorPredicatedOp() const { return isa<VPIntrinsic>(this); }

				/// \returns the effective Opcode of this operation (ignoring the mask and
				/// vector length param).
				unsigned getOpcode() const {
				auto *VPInst = dyn_cast<VPIntrinsic>(this);

				if (!VPInst) {
				return cast<Instruction>(this)->getOpcode();
				}

				return VPInst->getFunctionalOpcode();
				}

				bool isVectorReduction() const;

				static bool classof(const Instruction *I) { return isa<Instruction>(I); }
				static bool classof(const ConstantExpr *CE) { return false; }
				static bool classof(const Value *V) { return isa<Instruction>(V); }

				/// Convenience function for getting all the fast-math flags, which must be an
				/// operator which supports these flags. See LangRef.html for the meaning of
				/// these flags.
				FastMathFlags getFastMathFlags() const;
				};

				class PredicatedOperator : public User {
				public:
				// The PredicatedOperator class is intended to be used as a utility, and is
				// never itself instantiated.
				PredicatedOperator() = delete;
				~PredicatedOperator() = delete;

				void *operator new(size_t s) = delete;

				/// Return the opcode for this Instruction or ConstantExpr.
				unsigned getOpcode() const {
				auto *VPInst = dyn_cast<VPIntrinsic>(this);

				// Conceal the fp operation if it has non-default rounding mode or exception
				// behavior
				if (VPInst && !VPInst->isConstrainedOp()) {
				return VPInst->getFunctionalOpcode();
				}

				if (const Instruction *I = dyn_cast<Instruction>(this))
				return I->getOpcode();

				return cast<ConstantExpr>(this)->getOpcode();
				}

				Value *getMask() const {
				auto thisVP = dyn_cast<VPIntrinsic>(this);
				if (!thisVP)
				return nullptr;
				return thisVP->getMaskParam();
				}

				Value *getVectorLength() const {
				auto thisVP = dyn_cast<VPIntrinsic>(this);
				if (!thisVP)
				return nullptr;
				return thisVP->getVectorLengthParam();
				}

				void copyIRFlags(const Value *V, bool IncludeWrapFlags = true);
				FastMathFlags getFastMathFlags() const {
				auto *I = dyn_cast<Instruction>(this);
				if (I)
				return I->getFastMathFlags();
				else
				return FastMathFlags();
				}

				static bool classof(const Instruction *I) {
				return isa<VPIntrinsic>(I) || isa<Operator>(I);
				}
				static bool classof(const ConstantExpr *CE) { return isa<Operator>(CE); }
				static bool classof(const Value *V) {
				return isa<VPIntrinsic>(V) || isa<Operator>(V);
				}
				};

				class PredicatedBinaryOperator : public PredicatedOperator {
				public:
				// The PredicatedBinaryOperator class is intended to be used as a utility, and
				// is never itself instantiated.
				PredicatedBinaryOperator() = delete;
				~PredicatedBinaryOperator() = delete;

				using BinaryOps = Instruction::BinaryOps;

				void *operator new(size_t s) = delete;

				static bool classof(const Instruction *I) {
				if (isa<BinaryOperator>(I))
				return true;
				auto VPInst = dyn_cast<VPIntrinsic>(I);
				return VPInst && VPInst->isBinaryOp();
				}
				static bool classof(const ConstantExpr *CE) {
				return isa<BinaryOperator>(CE);
				}
				static bool classof(const Value *V) {
				auto *I = dyn_cast<Instruction>(V);
				if (I && classof(I))
				return true;
				auto *CE = dyn_cast<ConstantExpr>(V);
				return CE && classof(CE);
				}

				/// Construct a predicated binary instruction, given the opcode and the two
				/// operands.
				static Instruction *Create(Module *Mod, Value *Mask, Value *VectorLen,
				Instruction::BinaryOps Opc, Value *V1, Value *V2,
				const Twine &Name, BasicBlock *InsertAtEnd,
				Instruction *InsertBefore);

				static Instruction *Create(Module *Mod, Value *Mask, Value *VectorLen,
				BinaryOps Opc, Value *V1, Value *V2,
				const Twine &Name = Twine(),
				Instruction *InsertBefore = nullptr) {
				return Create(Mod, Mask, VectorLen, Opc, V1, V2, Name, nullptr,
				InsertBefore);
				}

				static Instruction *Create(Module *Mod, Value *Mask, Value *VectorLen,
				BinaryOps Opc, Value *V1, Value *V2,
				const Twine &Name, BasicBlock *InsertAtEnd) {
				return Create(Mod, Mask, VectorLen, Opc, V1, V2, Name, InsertAtEnd,
				nullptr);
				}

				static Instruction *CreateWithCopiedFlags(Module *Mod, Value *Mask,
				Value *VectorLen, BinaryOps Opc,
				Value *V1, Value *V2,
				Instruction *CopyBO,
				const Twine &Name = "") {
				Instruction *BO =
				Create(Mod, Mask, VectorLen, Opc, V1, V2, Name, nullptr, nullptr);
				BO->copyIRFlags(CopyBO);
				return BO;
				}
				};

				class PredicatedICmpInst : public PredicatedBinaryOperator {
				public:
				// The PredicatedICmpInst class is intended to be used as a utility, and is
				// never itself instantiated.
				PredicatedICmpInst() = delete;
				~PredicatedICmpInst() = delete;

				void *operator new(size_t s) = delete;

				static bool classof(const Instruction *I) {
				if (isa<ICmpInst>(I))
				return true;
				auto VPInst = dyn_cast<VPIntrinsic>(I);
				return VPInst && VPInst->getFunctionalOpcode() == Instruction::ICmp;
				}
				static bool classof(const ConstantExpr *CE) {
				return CE->getOpcode() == Instruction::ICmp;
				}
				static bool classof(const Value *V) {
				auto *I = dyn_cast<Instruction>(V);
				if (I && classof(I))
				return true;
				auto *CE = dyn_cast<ConstantExpr>(V);
				return CE && classof(CE);
				}

				ICmpInst::Predicate getPredicate() const {
				auto *ICInst = dyn_cast<const ICmpInst>(this);
				if (ICInst)
				return ICInst->getPredicate();
				auto *CE = dyn_cast<const ConstantExpr>(this);
				if (CE)
				return static_cast<ICmpInst::Predicate>(CE->getPredicate());
				return static_cast<ICmpInst::Predicate>(
				cast<VPIntrinsic>(this)->getCmpPredicate());
				}
				};

				class PredicatedFCmpInst : public PredicatedBinaryOperator {
				public:
				// The PredicatedFCmpInst class is intended to be used as a utility, and is
				// never itself instantiated.
				PredicatedFCmpInst() = delete;
				~PredicatedFCmpInst() = delete;

				void *operator new(size_t s) = delete;

				static bool classof(const Instruction *I) {
				if (isa<FCmpInst>(I))
				return true;
				auto VPInst = dyn_cast<VPIntrinsic>(I);
				return VPInst && VPInst->getFunctionalOpcode() == Instruction::FCmp;
				}
				static bool classof(const ConstantExpr *CE) {
				return CE->getOpcode() == Instruction::FCmp;
				}
				static bool classof(const Value *V) {
				auto *I = dyn_cast<Instruction>(V);
				if (I && classof(I))
				return true;
				auto *CE = dyn_cast<ConstantExpr>(V);
				return CE && classof(CE);
				}

				FCmpInst::Predicate getPredicate() const {
				auto *FCInst = dyn_cast<const FCmpInst>(this);
				if (FCInst)
				return FCInst->getPredicate();
				auto *CE = dyn_cast<const ConstantExpr>(this);
				if (CE)
				return static_cast<FCmpInst::Predicate>(CE->getPredicate());
				return static_cast<FCmpInst::Predicate>(
				cast<VPIntrinsic>(this)->getCmpPredicate());
				}
				};

				class PredicatedSelectInst : public PredicatedOperator {
				public:
				// The PredicatedSelectInst class is intended to be used as a utility, and is
				// never itself instantiated.
				PredicatedSelectInst() = delete;
				~PredicatedSelectInst() = delete;

				void *operator new(size_t s) = delete;

				static bool classof(const Instruction *I) {
				if (isa<SelectInst>(I))
				return true;
				auto VPInst = dyn_cast<VPIntrinsic>(I);
				return VPInst && VPInst->getFunctionalOpcode() == Instruction::Select;
				}
				static bool classof(const ConstantExpr *CE) {
				return CE->getOpcode() == Instruction::Select;
				}
				static bool classof(const Value *V) {
				auto *I = dyn_cast<Instruction>(V);
				if (I && classof(I))
				return true;
				auto *CE = dyn_cast<ConstantExpr>(V);
				return CE && CE->getOpcode() == Instruction::Select;
				}

				const Value *getCondition() const { return getOperand(0); }
				const Value *getTrueValue() const { return getOperand(1); }
				const Value *getFalseValue() const { return getOperand(2); }
				Value *getCondition() { return getOperand(0); }
				Value *getTrueValue() { return getOperand(1); }
				Value *getFalseValue() { return getOperand(2); }

				void setCondition(Value *V) { setOperand(0, V); }
				void setTrueValue(Value *V) { setOperand(1, V); }
				void setFalseValue(Value *V) { setOperand(2, V); }
				};

				namespace PatternMatch {

				// Match context for pattern matching on predicated instructions
				struct PredicatedContext {
				Value *Mask;
				Value *VectorLength;
				Module *Mod;

				void reset(Value *V) {
				auto *PI = dyn_cast<PredicatedInstruction>(V);
				if (!PI) {
				VectorLength = nullptr;
				Mask = nullptr;
				return;
				}
				VectorLength = PI->getVectorLengthParam();
				Mask = PI->getMaskParam();

				if (Mod) return;

				// try to get a hold of the Module
				auto *BB = PI->getParent();
				if (BB) {
				auto *Func = BB->getParent();
				if (Func) {
				Mod = Func->getParent();
				}
				}

				if (Mod) return;

				// try to infer the module from a call
				auto CallI = dyn_cast<CallInst>(V);
				if (CallI && CallI->getCalledFunction()) {
				Mod = CallI->getCalledFunction()->getParent();
				}
				}

				PredicatedContext(Value *Val)
				: Mask(nullptr), VectorLength(nullptr), Mod(nullptr) {
				reset(Val);
				}

				PredicatedContext(const PredicatedContext &PC)
				: Mask(PC.Mask), VectorLength(PC.VectorLength), Mod(PC.Mod) {}

				/// accept a match where \p Val is in a non-leaf position in a match pattern
				bool acceptInnerNode(const Value *Val) const {
				auto PredI = dyn_cast<PredicatedInstruction>(Val);
				if (!PredI)
				return VectorLength == nullptr && Mask == nullptr;
				return VectorLength == PredI->getVectorLengthParam() &&
				Mask == PredI->getMaskParam();
				}

				/// accept a match where \p Val is bound to a free variable.
				bool acceptBoundNode(const Value *Val) const { return true; }

				/// whether this context is compatible with \p PC.
				bool acceptContext(PredicatedContext PC) const {
				return std::tie(PC.Mask, PC.VectorLength) == std::tie(Mask, VectorLength);
				}

				/// merge the context \p PC into this context and return whether the resulting
				/// context is valid.
				bool mergeContext(PredicatedContext PC) const { return acceptContext(PC); }

				/// match \p P in a new context for \p V.
				template <typename Val, typename Pattern>
				bool reset_match(Val *V, const Pattern &P) {
				reset(V);
				return const_cast<Pattern &>(P).match_context(V, *this);
				}

				/// match \p P in the current context.
				template <typename Val, typename Pattern>
				bool try_match(Val *V, const Pattern &P) {
				PredicatedContext SubContext(*this);
				return const_cast<Pattern &>(P).match_context(V, SubContext);
				}
				};

				struct PredicatedContext;
				template <> struct MatcherCast<PredicatedContext, BinaryOperator> {
				using ActualCastType = PredicatedBinaryOperator;
				};
				template <> struct MatcherCast<PredicatedContext, Operator> {
				using ActualCastType = PredicatedOperator;
				};
				template <> struct MatcherCast<PredicatedContext, ICmpInst> {
				using ActualCastType = PredicatedICmpInst;
				};
				template <> struct MatcherCast<PredicatedContext, FCmpInst> {
				using ActualCastType = PredicatedFCmpInst;
				};
				template <> struct MatcherCast<PredicatedContext, SelectInst> {
				using ActualCastType = PredicatedSelectInst;
				};
				template <> struct MatcherCast<PredicatedContext, Instruction> {
				using ActualCastType = PredicatedInstruction;
				};

				} // namespace PatternMatch

				} // namespace llvm

				#endif // LLVM_IR_PREDICATEDINST_H
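
The acceptance rule in `PredicatedContext::acceptInnerNode` above can be illustrated without any LLVM dependency. The following is a minimal sketch, not the patch's implementation: `ToyNode`, `ToyMatchContext`, and `matchesTwoLevel` are hypothetical names, and opaque pointers stand in for the mask and vector length operands. It only models the idea that an inner node of a pattern is accepted when it carries the same mask and vector length as the pattern root, and that an unpredicated node is accepted only in an unpredicated context.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR node. Only the predication operands are
// modelled; a null pointer means "this node is not predicated".
struct ToyNode {
  const void *Mask;
  const void *VectorLength;
};

// Simplified model of a predicated match context: it records the mask and
// vector length of the pattern root.
struct ToyMatchContext {
  const void *Mask;
  const void *VectorLength;

  explicit ToyMatchContext(const ToyNode &Root)
      : Mask(Root.Mask), VectorLength(Root.VectorLength) {}

  // An inner (non-leaf) node is accepted only if its predication operands
  // are identical to those recorded for the root.
  bool acceptInnerNode(const ToyNode &N) const {
    return Mask == N.Mask && VectorLength == N.VectorLength;
  }

  // Leaves bound to free variables are always accepted.
  bool acceptBoundNode(const ToyNode &) const { return true; }
};

// Returns true if a two-level pattern rooted at Root may use Inner as an
// inner node.
inline bool matchesTwoLevel(const ToyNode &Root, const ToyNode &Inner) {
  ToyMatchContext Ctx(Root);
  return Ctx.acceptInnerNode(Inner);
}
```

Under this model, two operations predicated by the same mask and vector length can be folded together, while mixing a predicated root with an unpredicated inner node (or with a node under a different mask) is rejected.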

llvm/include/llvm/IR/VPBuilder.h

This file was added.

				#ifndef LLVM_IR_VPBUILDER_H
				#define LLVM_IR_VPBUILDER_H

				#include <llvm/IR/IRBuilder.h>
				#include <llvm/IR/Value.h>
				#include <llvm/IR/Instruction.h>
				#include <llvm/IR/InstrTypes.h>
				#include <llvm/IR/PredicatedInst.h>
				#include <llvm/IR/PatternMatch.h>

				namespace llvm {

				using ValArray = ArrayRef<Value*>;

				class VPBuilder {
				IRBuilder<> & Builder;

				// Explicit mask parameter
				Value * Mask;
				// Explicit vector length parameter
				Value * ExplicitVectorLength;
				// Compile-time vector length
				int StaticVectorLength;

				// get a valid mask/evl argument for the current predication context
				Value& RequestPred();
				Value& RequestEVL();

				public:
				VPBuilder(IRBuilder<> & _builder)
				: Builder(_builder)
				, Mask(nullptr)
				, ExplicitVectorLength(nullptr)
				, StaticVectorLength(-1)
				{}

				Module & getModule() const;
				LLVMContext & getContext() const { return Builder.getContext(); }

				// The canonical vector type for this \p ElementTy
				VectorType& getVectorType(Type &ElementTy);

				// Predication context tracker
				VPBuilder& setMask(Value * _Mask) { Mask = _Mask; return *this; }
				VPBuilder& setEVL(Value * _ExplicitVectorLength) { ExplicitVectorLength = _ExplicitVectorLength; return *this; }
				VPBuilder& setStaticVL(int VLen) { StaticVectorLength = VLen; return *this; }

				// Create a map-vectorized copy of the instruction \p Inst with the underlying IRBuilder instance.
				// This operation may return nullptr if the instruction could not be vectorized.
				Value* CreateVectorCopy(Instruction & Inst, ValArray VecOpArray);

				// shift the elements in \p SrcVal by Amount where the result lane is true.
				Value* CreateVectorShift(Value *SrcVal, Value *Amount, Twine Name="");

				// Memory
				Value& CreateContiguousStore(Value & Val, Value & Pointer, MaybeAlign Alignment);
				Value& CreateContiguousLoad(Value & Pointer, MaybeAlign Alignment);
				Value& CreateScatter(Value & Val, Value & PointerVec, MaybeAlign Alignment);
				Value& CreateGather(Value & PointerVec, MaybeAlign Alignment);
				};





				namespace PatternMatch {
				// Factory class to generate instructions in a context
				template<typename MatcherContext>
				class MatchContextBuilder {
				public:
				// MatchContextBuilder(MatcherContext MC);
				};


				// Context-free instruction builder
				template<>
				class MatchContextBuilder<EmptyContext> {
				public:
				MatchContextBuilder(EmptyContext & EC) {}

				#define HANDLE_BINARY_INST(N, OPC, CLASS) \
				Instruction *Create##OPC(Value *V1, Value *V2, \
				const Twine &Name = "") const {\
				return BinaryOperator::Create(Instruction::OPC, V1, V2, Name);\
				} \
				template<typename IRBuilderType> \
				Instruction *Create##OPC(IRBuilderType & Builder, Value *V1, Value *V2, \
				const Twine &Name = "") const { \
				auto * Inst = BinaryOperator::Create(Instruction::OPC, V1, V2, Name); \
				Builder.Insert(Inst); return Inst; \
				} \
				Instruction *Create##OPC(Value *V1, Value *V2, \
				const Twine &Name, BasicBlock *BB) const {\
				return BinaryOperator::Create(Instruction::OPC, V1, V2, Name, BB);\
				} \
				Instruction *Create##OPC(Value *V1, Value *V2, \
				const Twine &Name, Instruction *I) const {\
				return BinaryOperator::Create(Instruction::OPC, V1, V2, Name, I);\
				} \
				Instruction *Create##OPC##FMF(Value *V1, Value *V2, Instruction *FMFSource, \
				const Twine &Name = "") const {\
				return BinaryOperator::CreateWithCopiedFlags(Instruction::OPC, V1, V2, FMFSource, Name);\
				} \
				template<typename IRBuilderType> \
				Instruction *Create##OPC##FMF(IRBuilderType& Builder, Value *V1, Value *V2, Instruction *FMFSource, \
				const Twine &Name = "") const {\
				auto * Inst = BinaryOperator::CreateWithCopiedFlags(Instruction::OPC, V1, V2, FMFSource, Name);\
				Builder.Insert(Inst); return Inst; \
				}
				#include "llvm/IR/Instruction.def"
				#undef HANDLE_BINARY_INST

				BinaryOperator *CreateFNegFMF(Value *Op, Instruction *FMFSource,
				const Twine &Name = "") {
				Value *Zero = ConstantFP::getNegativeZero(Op->getType());
				return BinaryOperator::CreateWithCopiedFlags(Instruction::FSub, Zero, Op, FMFSource, Name);
				}

				template<typename IRBuilderType>
				Value *CreateFPTrunc(IRBuilderType & Builder, Value *V, Type *DestTy, const Twine & Name = Twine()) { return Builder.CreateFPTrunc(V, DestTy, Name); }
				template<typename IRBuilderType>
				Value *CreateFPExt(IRBuilderType & Builder, Value *V, Type *DestTy, const Twine & Name = Twine()) { return Builder.CreateFPExt(V, DestTy, Name); }
				};



				// Instruction builder for a predicated context
				template<>
				class MatchContextBuilder<PredicatedContext> {
				PredicatedContext & PC;
				public:
				MatchContextBuilder(PredicatedContext & PC) : PC(PC) {}

				#define HANDLE_BINARY_INST(N, OPC, CLASS) \
				Instruction *Create##OPC(Value *V1, Value *V2, \
				const Twine &Name = "") const {\
				return PredicatedBinaryOperator::Create(PC.Mod, PC.Mask, PC.VectorLength, Instruction::OPC, V1, V2, Name);\
				} \
				template<typename IRBuilderType> \
				Instruction *Create##OPC(IRBuilderType & Builder, Value *V1, Value *V2, \
				const Twine &Name = "") const {\
				auto * PredInst = Create##OPC(V1, V2, Name); \
				Builder.Insert(PredInst); \
				return PredInst; \
				} \
				Instruction *Create##OPC(Value *V1, Value *V2, \
				const Twine &Name, BasicBlock *BB) const {\
				return PredicatedBinaryOperator::Create(PC.Mod, PC.Mask, PC.VectorLength, Instruction::OPC, V1, V2, Name, BB);\
				} \
				Instruction *Create##OPC(Value *V1, Value *V2, \
				const Twine &Name, Instruction *I) const {\
				return PredicatedBinaryOperator::Create(PC.Mod, PC.Mask, PC.VectorLength, Instruction::OPC, V1, V2, Name, I);\
				} \
				Instruction *Create##OPC##FMF(Value *V1, Value *V2, Instruction *FMFSource, \
				const Twine &Name = "") const {\
				return PredicatedBinaryOperator::CreateWithCopiedFlags(PC.Mod, PC.Mask, PC.VectorLength, Instruction::OPC, V1, V2, FMFSource, Name);\
				} \
				template<typename IRBuilderType> \
				Instruction *Create##OPC##FMF(IRBuilderType& Builder, Value *V1, Value *V2, Instruction *FMFSource, \
				const Twine &Name = "") const {\
				auto * Inst = PredicatedBinaryOperator::CreateWithCopiedFlags(PC.Mod, PC.Mask, PC.VectorLength, Instruction::OPC, V1, V2, FMFSource, Name);\
				Builder.Insert(Inst); return Inst; \
				}
				#include "llvm/IR/Instruction.def"
				#undef HANDLE_BINARY_INST

				Instruction *CreateFNegFMF(Value *Op, Instruction *FMFSource,
				const Twine &Name = "") {
				Value *Zero = ConstantFP::getNegativeZero(Op->getType());
				return PredicatedBinaryOperator::CreateWithCopiedFlags(PC.Mod, PC.Mask, PC.VectorLength, Instruction::FSub, Zero, Op, FMFSource, Name);
				}

				// TODO predicated casts
				template<typename IRBuilderType>
				Value *CreateFPTrunc(IRBuilderType & Builder, Value *V, Type *DestTy, const Twine & Name = Twine()) { return Builder.CreateFPTrunc(V, DestTy, Name); }
				template<typename IRBuilderType>
				Value *CreateFPExt(IRBuilderType & Builder, Value *V, Type *DestTy, const Twine & Name = Twine()) { return Builder.CreateFPExt(V, DestTy, Name); }
				};

				}

				} // namespace llvm

				#endif // LLVM_IR_VPBUILDER_H
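
The `MatchContextBuilder` specializations above select, at compile time, whether a rewrite emits plain or predicated instructions. A minimal, LLVM-free sketch of that dispatch is given here, assuming strings in place of IR values; `ToyEmptyContext`, `ToyPredContext`, `ToyContextBuilder`, and `rebuildAdd` are hypothetical names introduced only for illustration. The operand order of the predicated form follows the `llvm.vp.add(x,y,mask,vlen)` convention used in the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical context types: one carries no predication, the other carries
// the mask and explicit vector length to attach to every created operation.
struct ToyEmptyContext {};
struct ToyPredContext {
  std::string Mask;
  std::string VectorLength;
};

// Primary template is declared but not defined: every supported context
// must provide its own specialization.
template <typename Ctx> class ToyContextBuilder;

// Context-free builder: emits the plain operation.
template <> class ToyContextBuilder<ToyEmptyContext> {
public:
  explicit ToyContextBuilder(ToyEmptyContext &) {}
  std::string createAdd(const std::string &A, const std::string &B) const {
    return "add " + A + ", " + B;
  }
};

// Predicated builder: appends the mask and vector length of the context.
template <> class ToyContextBuilder<ToyPredContext> {
  ToyPredContext &PC;

public:
  explicit ToyContextBuilder(ToyPredContext &PC) : PC(PC) {}
  std::string createAdd(const std::string &A, const std::string &B) const {
    return "vp.add " + A + ", " + B + ", " + PC.Mask + ", " + PC.VectorLength;
  }
};

// Generic rewrite code is written once against the builder interface and
// works for either context type.
template <typename Ctx>
std::string rebuildAdd(Ctx &C, const std::string &A, const std::string &B) {
  return ToyContextBuilder<Ctx>(C).createAdd(A, B);
}
```

The design choice this models is that a transformation written against the builder interface does not need to know whether it is operating on regular or predicated instructions.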

llvm/include/llvm/IR/VPIntrinsics.def

This file was added.
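
This file follows the X-macro pattern: a client defines only the macros it cares about, includes the file, and every other macro expands to nothing. The sketch below shows how such a list can be turned into an enum and a lookup function. It is a self-contained illustration rather than code from the patch: `TOY_VP_INTRINSICS`, `ToyVPID`, `getMaskPos`, and `getVectorLengthPos` are hypothetical, and an inline list macro stands in for including the `.def` file. The three entries and their mask/vector-length positions are taken from the `REGISTER_VP_INTRINSIC` lines of the file.

```cpp
#include <cassert>

// Hypothetical inline stand-in for the .def file. A real client would define
// REGISTER_VP_INTRINSIC and then include "llvm/IR/VPIntrinsics.def" instead.
#define TOY_VP_INTRINSICS(REGISTER)                                            \
  REGISTER(vp_add, 2, 3)                                                       \
  REGISTER(vp_fadd, 4, 5)                                                      \
  REGISTER(vp_fma, 5, 6)

// First expansion: one enumerator per registered intrinsic.
enum ToyVPID {
#define TOY_AS_ENUM(VPID, MASKPOS, VLENPOS) VPID,
  TOY_VP_INTRINSICS(TOY_AS_ENUM)
#undef TOY_AS_ENUM
  not_a_vp_intrinsic
};

// Second expansion: the operand index of the mask, or -1 if ID is unknown.
inline int getMaskPos(ToyVPID ID) {
  switch (ID) {
#define TOY_AS_MASK_CASE(VPID, MASKPOS, VLENPOS)                               \
  case VPID:                                                                   \
    return MASKPOS;
    TOY_VP_INTRINSICS(TOY_AS_MASK_CASE)
#undef TOY_AS_MASK_CASE
  default:
    return -1;
  }
}

// Third expansion: the operand index of the vector length, or -1.
inline int getVectorLengthPos(ToyVPID ID) {
  switch (ID) {
#define TOY_AS_VLEN_CASE(VPID, MASKPOS, VLENPOS)                               \
  case VPID:                                                                   \
    return VLENPOS;
    TOY_VP_INTRINSICS(TOY_AS_VLEN_CASE)
#undef TOY_AS_VLEN_CASE
  default:
    return -1;
  }
}
```

Adding an intrinsic then means adding one line to the list; the enum and both lookups stay in sync without further edits.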

				//===-- IR/VPIntrinsics.def - Describes llvm.vp.* Intrinsics -*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file contains descriptions of the various Vector Predication intrinsics.
				// This is used as a central place for enumerating the different instructions
				// and should eventually be the place to put comments about the instructions.
				//
				//===----------------------------------------------------------------------===//

				// NOTE: NO INCLUDE GUARD DESIRED!

				// Provide definitions of macros so that users of this file do not have to
				// define everything to use it...
				//
				#ifndef REGISTER_VP_INTRINSIC
				#define REGISTER_VP_INTRINSIC(VPID, MASKPOS, VLENPOS)
				#endif

				// This is a reduction intrinsic with accumulator arg at ACCUPOS, reduced vector
				// arg at VECTORPOS.
				#ifndef HANDLE_VP_REDUCTION
				#define HANDLE_VP_REDUCTION(VPID, ACCUPOS, VECTORPOS)
				#endif

				// The intrinsic VPID of llvm.vp.* functionally corresponds to the intrinsic
				// CFPID of llvm.experimental.constrained.*.
				#ifndef HANDLE_VP_TO_CONSTRAINED_INTRIN
				#define HANDLE_VP_TO_CONSTRAINED_INTRIN(VPID, CFPID)
				#endif

				// This VP intrinsic has constrained fp params.
				// Rounding mode arg pos is ROUNDPOS, exception behavior arg pos is EXCEPTPOS.
				#ifndef HANDLE_VP_FPCONSTRAINT
				#define HANDLE_VP_FPCONSTRAINT(VPID, ROUNDPOS, EXCEPTPOS)
				#endif

				// Map this VP intrinsic to its functional Opcode
				#ifndef HANDLE_VP_TO_OC
				#define HANDLE_VP_TO_OC(VPID, OC)
				#endif

				// Map this VP intrinsic to its canonical functional intrinsic.
				#ifndef HANDLE_VP_TO_INTRIN
				#define HANDLE_VP_TO_INTRIN(VPID, ID)
				#endif

				// This VP Intrinsic is a unary operator
				// (only count data params)
				#ifndef HANDLE_VP_IS_UNARY
				#define HANDLE_VP_IS_UNARY(VPID)
				#endif

				// This VP Intrinsic is a binary operator
				// (only count data params)
				#ifndef HANDLE_VP_IS_BINARY
				#define HANDLE_VP_IS_BINARY(VPID)
				#endif

				// This VP Intrinsic is a ternary operator
				// (only count data params)
				#ifndef HANDLE_VP_IS_TERNARY
				#define HANDLE_VP_IS_TERNARY(VPID)
				#endif

				// This VP Intrinsic is a comparison
				// (only count data params)
				#ifndef HANDLE_VP_IS_XCMP
				#define HANDLE_VP_IS_XCMP(VPID)
				#endif

				// This VP Intrinsic is a memory operation
				// The pointer arg is at POINTERPOS and the data arg is at DATAPOS.
				#ifndef HANDLE_VP_IS_MEMOP
				#define HANDLE_VP_IS_MEMOP(VPID, POINTERPOS, DATAPOS)
				#endif

				#ifndef REGISTER_VP_SDNODE
				#define REGISTER_VP_SDNODE(NODEID, NAME, MASKPOS, VLENPOS)
				#endif

				/// This VP Intrinsic lowers to this VP SDNode.
				#ifndef HANDLE_VP_TO_SDNODE
				#define HANDLE_VP_TO_SDNODE(VPID,NODEID)
				#endif

				///// Integer Arithmetic /////

				// llvm.vp.add(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_add, 2, 3)
				HANDLE_VP_TO_OC(vp_add, Add)
				HANDLE_VP_IS_BINARY(vp_add)
				REGISTER_VP_SDNODE(VP_ADD,"vp_add", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_add,VP_ADD)

				// llvm.vp.and(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_and, 2, 3)
				HANDLE_VP_TO_OC(vp_and, And)
				HANDLE_VP_IS_BINARY(vp_and)
				REGISTER_VP_SDNODE(VP_AND,"vp_and", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_and,VP_AND)

				// llvm.vp.ashr(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_ashr, 2, 3)
				HANDLE_VP_TO_OC(vp_ashr, AShr)
				HANDLE_VP_IS_BINARY(vp_ashr)
				REGISTER_VP_SDNODE(VP_SRA,"vp_sra", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_ashr,VP_SRA)

				// llvm.vp.lshr(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_lshr, 2, 3)
				HANDLE_VP_TO_OC(vp_lshr, LShr)
				HANDLE_VP_IS_BINARY(vp_lshr)
				REGISTER_VP_SDNODE(VP_SRL,"vp_srl", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_lshr,VP_SRL)

				// llvm.vp.mul(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_mul, 2, 3)
				HANDLE_VP_TO_OC(vp_mul, Mul)
				HANDLE_VP_IS_BINARY(vp_mul)
				REGISTER_VP_SDNODE(VP_MUL,"vp_mul", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_mul,VP_MUL)

				// llvm.vp.or(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_or, 2, 3)
				HANDLE_VP_TO_OC(vp_or, Or)
				HANDLE_VP_IS_BINARY(vp_or)
				REGISTER_VP_SDNODE(VP_OR,"vp_or", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_or,VP_OR)

				// llvm.vp.sdiv(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_sdiv, 2, 3)
				HANDLE_VP_TO_OC(vp_sdiv, SDiv)
				HANDLE_VP_IS_BINARY(vp_sdiv)
				REGISTER_VP_SDNODE(VP_SDIV,"vp_sdiv", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_sdiv,VP_SDIV)

				// llvm.vp.shl(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_shl, 2, 3)
				HANDLE_VP_TO_OC(vp_shl, Shl)
				HANDLE_VP_IS_BINARY(vp_shl)
				REGISTER_VP_SDNODE(VP_SHL,"vp_shl", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_shl,VP_SHL)

				// llvm.vp.srem(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_srem, 2, 3)
				HANDLE_VP_TO_OC(vp_srem, SRem)
				HANDLE_VP_IS_BINARY(vp_srem)
				REGISTER_VP_SDNODE(VP_SREM,"vp_srem", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_srem,VP_SREM)

				// llvm.vp.sub(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_sub, 2, 3)
				HANDLE_VP_TO_OC(vp_sub, Sub)
				HANDLE_VP_IS_BINARY(vp_sub)
				REGISTER_VP_SDNODE(VP_SUB,"vp_sub", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_sub,VP_SUB)

				// llvm.vp.udiv(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_udiv, 2, 3)
				HANDLE_VP_TO_OC(vp_udiv, UDiv)
				HANDLE_VP_IS_BINARY(vp_udiv)
				REGISTER_VP_SDNODE(VP_UDIV,"vp_udiv", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_udiv,VP_UDIV)

				// llvm.vp.urem(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_urem, 2, 3)
				HANDLE_VP_TO_OC(vp_urem, URem)
				HANDLE_VP_IS_BINARY(vp_urem)
				REGISTER_VP_SDNODE(VP_UREM,"vp_urem", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_urem,VP_UREM)

				// llvm.vp.xor(x,y,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_xor, 2, 3)
				HANDLE_VP_TO_OC(vp_xor, Xor)
				HANDLE_VP_IS_BINARY(vp_xor)
				REGISTER_VP_SDNODE(VP_XOR,"vp_xor", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_xor,VP_XOR)

				///// FP Arithmetic /////

				// llvm.vp.fadd(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fadd, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_fadd, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fadd, experimental_constrained_fadd)
				HANDLE_VP_TO_OC(vp_fadd, FAdd)
				HANDLE_VP_IS_BINARY(vp_fadd)
				REGISTER_VP_SDNODE(VP_FADD,"vp_fadd", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_fadd,VP_FADD)

				// llvm.vp.fdiv(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fdiv, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_fdiv, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fdiv, experimental_constrained_fdiv)
				HANDLE_VP_TO_OC(vp_fdiv, FDiv)
				HANDLE_VP_IS_BINARY(vp_fdiv)
				REGISTER_VP_SDNODE(VP_FDIV,"vp_fdiv", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_fdiv,VP_FDIV)

				// llvm.vp.fmul(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fmul, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_fmul, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fmul, experimental_constrained_fmul)
				HANDLE_VP_TO_OC(vp_fmul, FMul)
				HANDLE_VP_IS_BINARY(vp_fmul)
				REGISTER_VP_SDNODE(VP_FMUL,"vp_fmul", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_fmul,VP_FMUL)

				// llvm.vp.fneg(x,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fneg, 2, 3)
				HANDLE_VP_FPCONSTRAINT(vp_fneg, None, 1)
				HANDLE_VP_TO_OC(vp_fneg, FNeg)
				HANDLE_VP_IS_UNARY(vp_fneg)
				REGISTER_VP_SDNODE(VP_FNEG, "vp_fneg", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_fneg,VP_FNEG)

				// llvm.vp.frem(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_frem, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_frem, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_frem, experimental_constrained_frem)
				HANDLE_VP_TO_OC(vp_frem, FRem)
				HANDLE_VP_IS_BINARY(vp_frem)
				REGISTER_VP_SDNODE(VP_FREM, "vp_frem", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_frem,VP_FREM)

				// llvm.vp.fsub(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fsub, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_fsub, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fsub, experimental_constrained_fsub)
				HANDLE_VP_TO_OC(vp_fsub, FSub)
				HANDLE_VP_IS_BINARY(vp_fsub)
				REGISTER_VP_SDNODE(VP_FSUB, "vp_fsub", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_fsub,VP_FSUB)

				// llvm.vp.fma(x,y,z,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fma, 5, 6)
				HANDLE_VP_FPCONSTRAINT(vp_fma, 3, 4)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fma, experimental_constrained_fma)
				HANDLE_VP_TO_INTRIN(vp_fma, fma)
				HANDLE_VP_IS_TERNARY(vp_fma)
				REGISTER_VP_SDNODE(VP_FMA, "vp_fma", 5, 6)
				HANDLE_VP_TO_SDNODE(vp_fma,VP_FMA)

				///// Cast, Extend & Round /////

				// llvm.vp.ceil(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_ceil, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_ceil, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_ceil, experimental_constrained_ceil)
				HANDLE_VP_TO_INTRIN(vp_ceil, ceil)
				REGISTER_VP_SDNODE(VP_FCEIL, "vp_fceil", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_ceil,VP_FCEIL)

				// llvm.vp.trunc(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_trunc, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_trunc, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_trunc, experimental_constrained_trunc)
				HANDLE_VP_TO_INTRIN(vp_trunc, trunc)
				REGISTER_VP_SDNODE(VP_FTRUNC, "vp_ftrunc", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_trunc,VP_FTRUNC)

				// llvm.vp.floor(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_floor, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_floor, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_floor, experimental_constrained_floor)
				HANDLE_VP_TO_INTRIN(vp_floor, floor)
				REGISTER_VP_SDNODE(VP_FFLOOR, "vp_ffloor", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_floor,VP_FFLOOR)

				// llvm.vp.fpext(x,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fpext, 2, 3)
				HANDLE_VP_FPCONSTRAINT(vp_fpext, None, 1)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fpext, experimental_constrained_fpext)
				HANDLE_VP_TO_OC(vp_fpext, FPExt)
				REGISTER_VP_SDNODE(VP_FP_EXTEND, "vp_fpext", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_fpext,VP_FP_EXTEND)

				// llvm.vp.fptrunc(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fptrunc, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_fptrunc, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fptrunc, experimental_constrained_fptrunc)
				HANDLE_VP_TO_OC(vp_fptrunc, FPTrunc)
				HANDLE_VP_TO_SDNODE(vp_fptrunc,VP_FTRUNC)

				// llvm.vp.fptoui(x,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fptoui, 2, 3)
				HANDLE_VP_FPCONSTRAINT(vp_fptoui, None, 1)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fptoui, experimental_constrained_fptoui)
				HANDLE_VP_TO_OC(vp_fptoui, FPToUI)
				REGISTER_VP_SDNODE(VP_FP_TO_UINT, "vp_fp_to_uint", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_fptoui,VP_FP_TO_UINT)

				// llvm.vp.fptosi(x,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fptosi, 2, 3)
				HANDLE_VP_FPCONSTRAINT(vp_fptosi, None, 1)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_fptosi, experimental_constrained_fptosi)
				HANDLE_VP_TO_OC(vp_fptosi, FPToSI)
				REGISTER_VP_SDNODE(VP_FP_TO_SINT, "vp_fp_to_sint", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_fptosi,VP_FP_TO_SINT)

				// llvm.vp.uitofp(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_uitofp, 3, 4)
				HANDLE_VP_TO_OC(vp_uitofp, UIToFP)
				HANDLE_VP_FPCONSTRAINT(vp_uitofp, 1, 2)
				REGISTER_VP_SDNODE(VP_UINT_TO_FP, "vp_uint_to_fp", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_uitofp,VP_UINT_TO_FP)
				// HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_uitofp,experimental_constrained_uitofp)

				// llvm.vp.sitofp(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_sitofp, 3, 4)
				HANDLE_VP_TO_OC(vp_sitofp, SIToFP)
				HANDLE_VP_FPCONSTRAINT(vp_sitofp, 1, 2)
				REGISTER_VP_SDNODE(VP_SINT_TO_FP, "vp_sint_to_fp", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_sitofp,VP_SINT_TO_FP)
				// HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_sitofp,experimental_constrained_sitofp)

				// llvm.vp.round(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_round, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_round, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_round, experimental_constrained_round)
				HANDLE_VP_TO_INTRIN(vp_round, round)
				REGISTER_VP_SDNODE(VP_FROUND, "vp_fround", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_round,VP_FROUND)

				// llvm.vp.rint(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_rint, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_rint, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_rint, experimental_constrained_rint)
				HANDLE_VP_TO_INTRIN(vp_rint, rint)
				REGISTER_VP_SDNODE(VP_FRINT, "vp_frint", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_rint,VP_FRINT)

				// llvm.vp.nearbyint(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_nearbyint, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_nearbyint, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_nearbyint,
				experimental_constrained_nearbyint)
				HANDLE_VP_TO_INTRIN(vp_nearbyint, nearbyint)
				REGISTER_VP_SDNODE(VP_FNEARBYINT, "vp_fnearbyint", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_nearbyint,VP_FNEARBYINT)

				///// Math Funcs /////

				// llvm.vp.sqrt(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_sqrt, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_sqrt, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_sqrt, experimental_constrained_sqrt)
				HANDLE_VP_TO_INTRIN(vp_sqrt, sqrt)
				REGISTER_VP_SDNODE(VP_FSQRT, "vp_fsqrt", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_sqrt,VP_FSQRT)

				// llvm.vp.pow(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_pow, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_pow, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_pow, experimental_constrained_pow)
				HANDLE_VP_TO_INTRIN(vp_pow, pow)
				REGISTER_VP_SDNODE(VP_FPOW, "vp_fpow", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_pow,VP_FPOW)

				// llvm.vp.powi(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_powi, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_powi, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_powi, experimental_constrained_powi)
				HANDLE_VP_TO_INTRIN(vp_powi, powi)
				REGISTER_VP_SDNODE(VP_FPOWI, "vp_fpowi", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_powi,VP_FPOWI)

				// llvm.vp.maxnum(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_maxnum, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_maxnum, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_maxnum, experimental_constrained_maxnum)
				HANDLE_VP_TO_INTRIN(vp_maxnum, maxnum)
				REGISTER_VP_SDNODE(VP_FMAXNUM, "vp_fmaxnum", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_maxnum,VP_FMAXNUM)

				// llvm.vp.minnum(x,y,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_minnum, 4, 5)
				HANDLE_VP_FPCONSTRAINT(vp_minnum, 2, 3)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_minnum, experimental_constrained_minnum)
				HANDLE_VP_TO_INTRIN(vp_minnum, minnum)
				REGISTER_VP_SDNODE(VP_FMINNUM, "vp_fminnum", 4, 5)
				HANDLE_VP_TO_SDNODE(vp_minnum,VP_FMINNUM)

				// llvm.vp.sin(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_sin, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_sin, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_sin, experimental_constrained_sin)
				HANDLE_VP_TO_INTRIN(vp_sin, sin)
				REGISTER_VP_SDNODE(VP_FSIN, "vp_fsin", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_sin,VP_FSIN)

				// llvm.vp.cos(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_cos, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_cos, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_cos, experimental_constrained_cos)
				HANDLE_VP_TO_INTRIN(vp_cos, cos)
				REGISTER_VP_SDNODE(VP_FCOS, "vp_fcos", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_cos,VP_FCOS)

				// llvm.vp.log(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_log, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_log, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_log, experimental_constrained_log)
				HANDLE_VP_TO_INTRIN(vp_log, log)
				REGISTER_VP_SDNODE(VP_FLOG, "vp_flog", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_log,VP_FLOG)

				// llvm.vp.log10(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_log10, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_log10, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_log10, experimental_constrained_log10)
				HANDLE_VP_TO_INTRIN(vp_log10, log10)
				REGISTER_VP_SDNODE(VP_FLOG10, "vp_flog10", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_log10,VP_FLOG10)

				// llvm.vp.log2(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_log2, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_log2, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_log2, experimental_constrained_log2)
				HANDLE_VP_TO_INTRIN(vp_log2, log2)
				REGISTER_VP_SDNODE(VP_FLOG2, "vp_flog2", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_log2,VP_FLOG2)

				// llvm.vp.exp(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_exp, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_exp, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_exp, experimental_constrained_exp)
				HANDLE_VP_TO_INTRIN(vp_exp, exp)
				REGISTER_VP_SDNODE(VP_FEXP, "vp_fexp", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_exp,VP_FEXP)

				// llvm.vp.exp2(x,round,except,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_exp2, 3, 4)
				HANDLE_VP_FPCONSTRAINT(vp_exp2, 1, 2)
				HANDLE_VP_TO_CONSTRAINED_INTRIN(vp_exp2, experimental_constrained_exp2)
				HANDLE_VP_TO_INTRIN(vp_exp2, exp2)
				REGISTER_VP_SDNODE(VP_FEXP2, "vp_fexp2", 3, 4)
				HANDLE_VP_TO_SDNODE(vp_exp2,VP_FEXP2)

				///// Comparison /////

				// llvm.vp.fcmp(x,y,pred,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_fcmp, 3, 4)
				HANDLE_VP_TO_OC(vp_fcmp, FCmp)
				HANDLE_VP_IS_XCMP(vp_fcmp)

				// llvm.vp.icmp(x,y,cmp_pred,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_icmp, 3, 4)
				HANDLE_VP_TO_OC(vp_icmp, ICmp)
				HANDLE_VP_IS_XCMP(vp_icmp)

				REGISTER_VP_SDNODE(VP_SETCC, "vp_setcc", 3, 4)

				///// Memory Operations /////

				// llvm.vp.store(val,ptr,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_store, 2, 3)
				HANDLE_VP_TO_OC(vp_store, Store)
				HANDLE_VP_TO_INTRIN(vp_store, masked_store)
				HANDLE_VP_IS_MEMOP(vp_store, 1, 0)
				REGISTER_VP_SDNODE(VP_STORE, "vp_store", 3, 4)

				// llvm.vp.scatter(val,ptr,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_scatter, 2, 3)
				HANDLE_VP_TO_INTRIN(vp_scatter, masked_scatter)
				HANDLE_VP_IS_MEMOP(vp_scatter, 1, 0)
				REGISTER_VP_SDNODE(VP_SCATTER, "vp_scatter", 3, 4)

				// llvm.vp.load(ptr,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_load, 1, 2)
				HANDLE_VP_TO_OC(vp_load, Load)
				HANDLE_VP_TO_INTRIN(vp_load, masked_load)
				HANDLE_VP_IS_MEMOP(vp_load, 0, None)
				REGISTER_VP_SDNODE(VP_LOAD, "vp_load", 2, 3)

				// llvm.vp.gather(ptr,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_gather, 1, 2)
				HANDLE_VP_TO_INTRIN(vp_gather, masked_gather)
				HANDLE_VP_IS_MEMOP(vp_gather, 0, None)
				REGISTER_VP_SDNODE(VP_GATHER, "vp_gather", 1, 2)

				///// Shuffle & Blend /////

				// llvm.vp.compress(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_compress, 1, 2)
				REGISTER_VP_SDNODE(VP_COMPRESS, "vp_compress", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_compress,VP_COMPRESS)

				// llvm.vp.expand(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_expand, 1, 2)
				REGISTER_VP_SDNODE(VP_EXPAND, "vp_expand", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_expand,VP_EXPAND)

				// llvm.vp.vshift(x,amount,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_vshift, 2, 3)
				REGISTER_VP_SDNODE(VP_VSHIFT, "vp_vshift", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_vshift,VP_VSHIFT)

				// llvm.vp.select(mask,on_true,on_false,vlen)
				REGISTER_VP_INTRINSIC(vp_select, 0, 3)
				HANDLE_VP_TO_OC(vp_select, Select)
				REGISTER_VP_SDNODE(VP_SELECT, "vp_select", 0, 3)
				HANDLE_VP_TO_SDNODE(vp_select,VP_SELECT)

				// llvm.vp.compose(x,y,pivot,vlen)
				REGISTER_VP_INTRINSIC(vp_compose, None, 3)
				REGISTER_VP_SDNODE(VP_COMPOSE, "vp_compose", None, 3)
				HANDLE_VP_TO_SDNODE(vp_compose,VP_COMPOSE)

				///// Reduction /////

				// llvm.vp.reduce.add(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_add, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_add, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_add, experimental_vector_reduce_add)
				REGISTER_VP_SDNODE(VP_REDUCE_ADD, "vp_reduce_add", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_add,VP_REDUCE_ADD)

				// llvm.vp.reduce.mul(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_mul, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_mul, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_mul, experimental_vector_reduce_mul)
				REGISTER_VP_SDNODE(VP_REDUCE_MUL, "vp_reduce_mul", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_mul,VP_REDUCE_MUL)

				// llvm.vp.reduce.and(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_and, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_and, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_and, experimental_vector_reduce_and)
				REGISTER_VP_SDNODE(VP_REDUCE_AND, "vp_reduce_and", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_and,VP_REDUCE_AND)

				// llvm.vp.reduce.or(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_or, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_or, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_or, experimental_vector_reduce_or)
				REGISTER_VP_SDNODE(VP_REDUCE_OR, "vp_reduce_or", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_or,VP_REDUCE_OR)

				// llvm.vp.reduce.xor(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_xor, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_xor, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_xor, experimental_vector_reduce_xor)
				REGISTER_VP_SDNODE(VP_REDUCE_XOR, "vp_reduce_xor", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_xor,VP_REDUCE_XOR)

				// llvm.vp.reduce.smin(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_smin, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_smin, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_smin, experimental_vector_reduce_smin)
				REGISTER_VP_SDNODE(VP_REDUCE_SMIN, "vp_reduce_smin", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_smin,VP_REDUCE_SMIN)

				// llvm.vp.reduce.smax(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_smax, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_smax, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_smax, experimental_vector_reduce_smax)
				REGISTER_VP_SDNODE(VP_REDUCE_SMAX, "vp_reduce_smax", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_smax,VP_REDUCE_SMAX)

				// llvm.vp.reduce.umin(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_umin, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_umin, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_umin, experimental_vector_reduce_umin)
				REGISTER_VP_SDNODE(VP_REDUCE_UMIN, "vp_reduce_umin", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_umin,VP_REDUCE_UMIN)

				// llvm.vp.reduce.umax(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_umax, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_umax, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_umax, experimental_vector_reduce_umax)
				REGISTER_VP_SDNODE(VP_REDUCE_UMAX, "vp_reduce_umax", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_umax,VP_REDUCE_UMAX)

				// llvm.vp.reduce.fadd(accu,x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_fadd, 2, 3)
				HANDLE_VP_REDUCTION(vp_reduce_fadd, 0, 1)
				HANDLE_VP_TO_INTRIN(vp_reduce_fadd, experimental_vector_reduce_v2_fadd)
				REGISTER_VP_SDNODE(VP_REDUCE_FADD, "vp_reduce_fadd", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_reduce_fadd,VP_REDUCE_FADD)

				// llvm.vp.reduce.fmul(accu,x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_fmul, 2, 3)
				HANDLE_VP_REDUCTION(vp_reduce_fmul, 0, 1)
				HANDLE_VP_TO_INTRIN(vp_reduce_fmul, experimental_vector_reduce_v2_fmul)
				REGISTER_VP_SDNODE(VP_REDUCE_FMUL, "vp_reduce_fmul", 2, 3)
				HANDLE_VP_TO_SDNODE(vp_reduce_fmul,VP_REDUCE_FMUL)

				// llvm.vp.reduce.fmin(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_fmin, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_fmin, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_fmin, experimental_vector_reduce_fmin)
				REGISTER_VP_SDNODE(VP_REDUCE_FMIN, "vp_reduce_fmin", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_fmin,VP_REDUCE_FMIN)

				// llvm.vp.reduce.fmax(x,mask,vlen)
				REGISTER_VP_INTRINSIC(vp_reduce_fmax, 1, 2)
				HANDLE_VP_REDUCTION(vp_reduce_fmax, None, 0)
				HANDLE_VP_TO_INTRIN(vp_reduce_fmax, experimental_vector_reduce_fmax)
				REGISTER_VP_SDNODE(VP_REDUCE_FMAX, "vp_reduce_fmax", 1, 2)
				HANDLE_VP_TO_SDNODE(vp_reduce_fmax,VP_REDUCE_FMAX)

				#undef REGISTER_VP_INTRINSIC
				#undef REGISTER_VP_SDNODE
				#undef HANDLE_VP_IS_UNARY
				#undef HANDLE_VP_IS_BINARY
				#undef HANDLE_VP_IS_TERNARY
				#undef HANDLE_VP_IS_XCMP
				#undef HANDLE_VP_IS_MEMOP
				#undef HANDLE_VP_TO_OC
				#undef HANDLE_VP_TO_CONSTRAINED_INTRIN
				#undef HANDLE_VP_TO_INTRIN
				#undef HANDLE_VP_FPCONSTRAINT
				#undef HANDLE_VP_REDUCTION
				#undef HANDLE_VP_TO_SDNODE
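
The registrations above follow the X-macro pattern: a client defines only the macros it needs, includes the `.def` file, and each entry expands into client code. The sketch below imitates that in a self-contained way. `VP_DEMO_LIST` (a stand-in for the `#include` of the `.def` file), `VPDesc`, `NoPos`, and `lookupVP` are hypothetical names for illustration only. The entries and their mask/vlen positions are copied from the listing above. This is not the patch's actual consumer code.

```cpp
#include <cstring>

// Hypothetical stand-in for including the .def file: a real client would
// define REGISTER_VP_INTRINSIC and then #include the file instead.
#define VP_DEMO_LIST(ENTRY)                                                    \
  ENTRY(vp_sqrt, 3, 4)                                                         \
  ENTRY(vp_pow, 4, 5)                                                          \
  ENTRY(vp_load, 1, 2)                                                         \
  ENTRY(vp_select, 0, 3)                                                       \
  ENTRY(vp_compose, NoPos, 3)

// Assumption for this demo: `None` in the listing is modelled as -1.
const int NoPos = -1;

// One table row per registered intrinsic.
struct VPDesc {
  const char *Name;
  int MaskPos; // operand index of the mask, or NoPos
  int VlenPos; // operand index of the vector length
};

// Expand every list entry into a table row, then drop the helper macro,
// mirroring the #undef block at the end of the .def file.
#define MAKE_DESC(NAME, MASKPOS, VLENPOS) {#NAME, MASKPOS, VLENPOS},
static const VPDesc VPTable[] = {VP_DEMO_LIST(MAKE_DESC)};
#undef MAKE_DESC

// Linear search by name; returns nullptr for unregistered names.
const VPDesc *lookupVP(const char *Name) {
  for (const VPDesc &D : VPTable)
    if (std::strcmp(D.Name, Name) == 0)
      return &D;
  return nullptr;
}
```

With this table, `lookupVP("vp_pow")->VlenPos` yields 5 and `lookupVP("vp_compose")->MaskPos` yields `NoPos`. Adding an intrinsic only requires a new list entry.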

llvm/include/llvm/InitializePasses.h

	[139 unchanged lines not shown]
	void initializeEarlyMachineLICMPass(PassRegistry&);			void initializeEarlyMachineLICMPass(PassRegistry&);
	void initializeEarlyTailDuplicatePass(PassRegistry&);			void initializeEarlyTailDuplicatePass(PassRegistry&);
	void initializeEdgeBundlesPass(PassRegistry&);			void initializeEdgeBundlesPass(PassRegistry&);
	void initializeEliminateAvailableExternallyLegacyPassPass(PassRegistry&);			void initializeEliminateAvailableExternallyLegacyPassPass(PassRegistry&);
	void initializeEntryExitInstrumenterPass(PassRegistry&);			void initializeEntryExitInstrumenterPass(PassRegistry&);
	void initializeExpandMemCmpPassPass(PassRegistry&);			void initializeExpandMemCmpPassPass(PassRegistry&);
	void initializeExpandPostRAPass(PassRegistry&);			void initializeExpandPostRAPass(PassRegistry&);
	void initializeExpandReductionsPass(PassRegistry&);			void initializeExpandReductionsPass(PassRegistry&);
				void initializeExpandVectorPredicationPass(PassRegistry&);
				Lint: Pre-merge checks: clang-format: please reformat the code: -void initializeExpandVectorPredicationPass(PassRegistry&); +void initializeExpandVectorPredicationPass(PassRegistry &);
	void initializeMakeGuardsExplicitLegacyPassPass(PassRegistry&);			void initializeMakeGuardsExplicitLegacyPassPass(PassRegistry&);
	void initializeExternalAAWrapperPassPass(PassRegistry&);			void initializeExternalAAWrapperPassPass(PassRegistry&);
	void initializeFEntryInserterPass(PassRegistry&);			void initializeFEntryInserterPass(PassRegistry&);
	void initializeFinalizeISelPass(PassRegistry&);			void initializeFinalizeISelPass(PassRegistry&);
	void initializeFinalizeMachineBundlesPass(PassRegistry&);			void initializeFinalizeMachineBundlesPass(PassRegistry&);
	void initializeFlattenCFGPassPass(PassRegistry&);			void initializeFlattenCFGPassPass(PassRegistry&);
	void initializeFloat2IntLegacyPassPass(PassRegistry&);			void initializeFloat2IntLegacyPassPass(PassRegistry&);
	void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);			void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);
	[279 unchanged lines not shown]

llvm/include/llvm/Target/TargetSelectionDAG.td

[122 unchanged lines not shown]
]>;		]>;
def SDTIntBinHiLoOp : SDTypeProfile<2, 2, [ // mulhi, mullo, sdivrem, udivrem		def SDTIntBinHiLoOp : SDTypeProfile<2, 2, [ // mulhi, mullo, sdivrem, udivrem
SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisSameAs<0, 3>,SDTCisInt<0>		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisSameAs<0, 3>,SDTCisInt<0>
]>;		]>;
def SDTIntScaledBinOp : SDTypeProfile<1, 3, [ // smulfix, sdivfix, etc		def SDTIntScaledBinOp : SDTypeProfile<1, 3, [ // smulfix, sdivfix, etc
SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisInt<0>, SDTCisInt<3>		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisInt<0>, SDTCisInt<3>
]>;		]>;

		def SDTIntBinOpVP : SDTypeProfile<1, 4, [ // vp_add, vp_and, etc.
		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisInt<0>, SDTCisInt<4>, SDTCisSameNumEltsAs<0, 3>
		]>;
		def SDTIntShiftOpVP : SDTypeProfile<1, 4, [ // vp_shl, vp_sra, vp_srl
		SDTCisSameAs<0, 1>, SDTCisInt<0>, SDTCisInt<2>, SDTCisInt<4>, SDTCisSameNumEltsAs<0, 3>
		]>;

def SDTFPBinOp : SDTypeProfile<1, 2, [ // fadd, fmul, etc.		def SDTFPBinOp : SDTypeProfile<1, 2, [ // fadd, fmul, etc.
SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisFP<0>		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisFP<0>
]>;		]>;
def SDTFPSignOp : SDTypeProfile<1, 2, [ // fcopysign.		def SDTFPSignOp : SDTypeProfile<1, 2, [ // fcopysign.
SDTCisSameAs<0, 1>, SDTCisFP<0>, SDTCisFP<2>		SDTCisSameAs<0, 1>, SDTCisFP<0>, SDTCisFP<2>
]>;		]>;
def SDTFPTernaryOp : SDTypeProfile<1, 3, [ // fmadd, fnmsub, etc.		def SDTFPTernaryOp : SDTypeProfile<1, 3, [ // fmadd, fnmsub, etc.
SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisSameAs<0, 3>, SDTCisFP<0>		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisSameAs<0, 3>, SDTCisFP<0>
[20 unchanged lines not shown]
def SDTFPExtendOp : SDTypeProfile<1, 1, [ // fextend		def SDTFPExtendOp : SDTypeProfile<1, 1, [ // fextend
SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<1, 0>, SDTCisSameNumEltsAs<0, 1>		SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<1, 0>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;
def SDTIntToFPOp : SDTypeProfile<1, 1, [ // [su]int_to_fp		def SDTIntToFPOp : SDTypeProfile<1, 1, [ // [su]int_to_fp
SDTCisFP<0>, SDTCisInt<1>, SDTCisSameNumEltsAs<0, 1>		SDTCisFP<0>, SDTCisInt<1>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;
def SDTFPToIntOp : SDTypeProfile<1, 1, [ // fp_to_[su]int		def SDTFPToIntOp : SDTypeProfile<1, 1, [ // fp_to_[su]int
SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>		SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;
		def SDTIntToFPOpVP : SDTypeProfile<1, 3, [ // vp_[su]int_to_fp
		SDTCisFP<0>, SDTCisInt<1>, SDTCisSameNumEltsAs<0, 1>, SDTCisInt<3>, SDTCisSameNumEltsAs<0, 2>
		]>;
		def SDTFPToIntOpVP : SDTypeProfile<1, 3, [ // vp_fp_to_[su]int
		SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>, SDTCisInt<3>, SDTCisSameNumEltsAs<0, 2>
		]>;
def SDTExtInreg : SDTypeProfile<1, 2, [ // sext_inreg		def SDTExtInreg : SDTypeProfile<1, 2, [ // sext_inreg
SDTCisSameAs<0, 1>, SDTCisInt<0>, SDTCisVT<2, OtherVT>,		SDTCisSameAs<0, 1>, SDTCisInt<0>, SDTCisVT<2, OtherVT>,
SDTCisVTSmallerThanOp<2, 1>		SDTCisVTSmallerThanOp<2, 1>
]>;		]>;
def SDTExtInvec : SDTypeProfile<1, 1, [ // sext_invec		def SDTExtInvec : SDTypeProfile<1, 1, [ // sext_invec
SDTCisInt<0>, SDTCisVec<0>, SDTCisInt<1>, SDTCisVec<1>,		SDTCisInt<0>, SDTCisVec<0>, SDTCisInt<1>, SDTCisVec<1>,
SDTCisOpSmallerThanOp<1, 0>		SDTCisOpSmallerThanOp<1, 0>
]>;		]>;

		def SDTFPUnOpVP : SDTypeProfile<1, 3, [ // vp_fneg, etc.
		SDTCisSameAs<0, 1>, SDTCisFP<0>, SDTCisInt<2>, SDTCisSameNumEltsAs<0, 2>, SDTCisInt<3>
		]>;
		def SDTFPBinOpVP : SDTypeProfile<1, 4, [ // vp_fadd, etc.
		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisFP<0>, SDTCisInt<3>, SDTCisSameNumEltsAs<0, 3>, SDTCisInt<4>
		]>;
		def SDTFPTernaryOpVP : SDTypeProfile<1, 5, [ // vp_fmadd, etc.
		SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisSameAs<0, 3>, SDTCisFP<0>, SDTCisInt<4>, SDTCisSameNumEltsAs<0, 4>, SDTCisInt<5>
		]>;

def SDTSetCC : SDTypeProfile<1, 3, [ // setcc		def SDTSetCC : SDTypeProfile<1, 3, [ // setcc
SDTCisInt<0>, SDTCisSameAs<1, 2>, SDTCisVT<3, OtherVT>		SDTCisInt<0>, SDTCisSameAs<1, 2>, SDTCisVT<3, OtherVT>
]>;		]>;
		def SDTSetCCVP : SDTypeProfile<1, 5, [ // vp_setcc
		SDTCisInt<0>, SDTCisSameAs<1, 2>, SDTCisVT<3, OtherVT>, SDTCisInt<4>, SDTCisSameNumEltsAs<0, 4>, SDTCisInt<5>
		]>;

def SDTSelect : SDTypeProfile<1, 3, [ // select		def SDTSelect : SDTypeProfile<1, 3, [ // select
SDTCisInt<1>, SDTCisSameAs<0, 2>, SDTCisSameAs<2, 3>		SDTCisInt<1>, SDTCisSameAs<0, 2>, SDTCisSameAs<2, 3>
]>;		]>;

def SDTVSelect : SDTypeProfile<1, 3, [ // vselect		def SDTVSelect : SDTypeProfile<1, 3, [ // vselect
SDTCisVec<0>, SDTCisInt<1>, SDTCisSameAs<0, 2>, SDTCisSameAs<2, 3>, SDTCisSameNumEltsAs<0, 1>		SDTCisVec<0>, SDTCisInt<1>, SDTCisSameAs<0, 2>, SDTCisSameAs<2, 3>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;

		def SDTSelectVP : SDTypeProfile<1, 4, [ // vp_select
		SDTCisVec<0>, SDTCisInt<1>, SDTCisSameAs<0, 2>, SDTCisSameAs<2, 3>, SDTCisSameNumEltsAs<0, 1>, SDTCisInt<4>
		]>;

def SDTSelectCC : SDTypeProfile<1, 5, [ // select_cc		def SDTSelectCC : SDTypeProfile<1, 5, [ // select_cc
SDTCisSameAs<1, 2>, SDTCisSameAs<3, 4>, SDTCisSameAs<0, 3>,		SDTCisSameAs<1, 2>, SDTCisSameAs<3, 4>, SDTCisSameAs<0, 3>,
SDTCisVT<5, OtherVT>		SDTCisVT<5, OtherVT>
]>;		]>;

def SDTBr : SDTypeProfile<0, 1, [ // br		def SDTBr : SDTypeProfile<0, 1, [ // br
SDTCisVT<0, OtherVT>		SDTCisVT<0, OtherVT>
]>;		]>;
[32 unchanged lines not shown]
def SDTMaskedStore: SDTypeProfile<0, 4, [ // masked store		def SDTMaskedStore: SDTypeProfile<0, 4, [ // masked store
SDTCisVec<0>, SDTCisPtrTy<1>, SDTCisPtrTy<2>, SDTCisVec<3>, SDTCisSameNumEltsAs<0, 3>		SDTCisVec<0>, SDTCisPtrTy<1>, SDTCisPtrTy<2>, SDTCisVec<3>, SDTCisSameNumEltsAs<0, 3>
]>;		]>;

def SDTMaskedLoad: SDTypeProfile<1, 4, [ // masked load		def SDTMaskedLoad: SDTypeProfile<1, 4, [ // masked load
SDTCisVec<0>, SDTCisPtrTy<1>, SDTCisPtrTy<2>, SDTCisVec<3>, SDTCisSameAs<0, 4>,		SDTCisVec<0>, SDTCisPtrTy<1>, SDTCisPtrTy<2>, SDTCisVec<3>, SDTCisSameAs<0, 4>,
SDTCisSameNumEltsAs<0, 3>		SDTCisSameNumEltsAs<0, 3>
]>;		]>;

		def SDTStoreVP: SDTypeProfile<0, 4, [ // vp store
		SDTCisVec<0>, SDTCisPtrTy<1>, SDTCisVec<2>, SDTCisSameNumEltsAs<0, 2>, SDTCisInt<3>
		]>;

		// scatter (Value, BasePtr, Index, Scale, Mask, Vlen)
		def SDTScatterVP: SDTypeProfile<0, 6, [ // vp scatter
		SDTCisVec<0>, SDTCisInt<1>, SDTCisVec<2>, SDTCisInt<3>, SDTCisVec<4>, SDTCisSameNumEltsAs<0, 2>, SDTCisSameNumEltsAs<2, 4>, SDTCisInt<5>
		]>;

		// gather (BasePtr, Index, Scale, Mask, Vlen)
		def SDTGatherVP: SDTypeProfile<1, 5, [ // vp gather
		SDTCisVec<0>, SDTCisInt<1>, SDTCisVec<2>, SDTCisInt<3>, SDTCisVec<4>, SDTCisSameNumEltsAs<0, 2>, SDTCisSameNumEltsAs<2, 4>, SDTCisInt<5>
		]>;

		def SDTLoadVP : SDTypeProfile<1, 3, [ // vp load
		SDTCisVec<0>, SDTCisPtrTy<1>, SDTCisSameNumEltsAs<0, 2>, SDTCisInt<3>
		]>;

def SDTVecShuffle : SDTypeProfile<1, 2, [		def SDTVecShuffle : SDTypeProfile<1, 2, [
SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>		SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>
]>;		]>;
		def SDTVShiftVP : SDTypeProfile<1, 4, [
		SDTCisSameAs<0, 1>, SDTCisVec<0>, SDTCisVec<3>, SDTCisSameNumEltsAs<3,1>, SDTCisInt<4>
		]>;
def SDTVecExtract : SDTypeProfile<1, 2, [ // vector extract		def SDTVecExtract : SDTypeProfile<1, 2, [ // vector extract
SDTCisEltOfVec<0, 1>, SDTCisPtrTy<2>		SDTCisEltOfVec<0, 1>, SDTCisPtrTy<2>
]>;		]>;
def SDTVecInsert : SDTypeProfile<1, 3, [ // vector insert		def SDTVecInsert : SDTypeProfile<1, 3, [ // vector insert
SDTCisEltOfVec<2, 1>, SDTCisSameAs<0, 1>, SDTCisPtrTy<3>		SDTCisEltOfVec<2, 1>, SDTCisSameAs<0, 1>, SDTCisPtrTy<3>
]>;		]>;
def SDTVecReduce : SDTypeProfile<1, 1, [ // vector reduction		def SDTVecReduce : SDTypeProfile<1, 1, [ // vector reduction
SDTCisInt<0>, SDTCisVec<1>		SDTCisInt<0>, SDTCisVec<1>
]>;		]>;
		def SDTReduceVP : SDTypeProfile<1, 3, [ // vp_reduce (no start arg)
		SDTCisVec<1>, SDTCisInt<2>, SDTCisVec<2>, SDTCisInt<3>, SDTCisSameNumEltsAs<1,2>
		]>;
		def SDTReduceStartVP : SDTypeProfile<1, 4, [ // vp_reduce (with start arg)
		SDTCisVec<2>, SDTCisInt<3>, SDTCisVec<3>, SDTCisInt<4>, SDTCisSameNumEltsAs<2,3>
		]>;
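
In an `SDTypeProfile`, constraint indices number the results first and the operands after them. For `SDTIntBinOpVP` (1 result, 4 operands) that makes index 0 the result, and indices 3 and 4 appear to be the mask and the vector length. A minimal C++ sketch of that numbering follows. `Profile` and `describeIndex` are hypothetical helpers for illustration and are not part of LLVM or TableGen.

```cpp
#include <string>

// Hypothetical model of SDTypeProfile<NumResults, NumOperands, ...>.
struct Profile {
  int NumResults;
  int NumOperands;
};

// Map a constraint index (as used in SDTCisInt<N>, SDTCisVec<N>, ...) to
// the result or operand it refers to. Results come first, operands follow.
std::string describeIndex(const Profile &P, int Index) {
  if (Index < 0 || Index >= P.NumResults + P.NumOperands)
    return "out of range";
  if (Index < P.NumResults)
    return "result " + std::to_string(Index);
  return "operand " + std::to_string(Index - P.NumResults);
}
```

For example, with `Profile{1, 4}` index 4 maps to "operand 3", the last operand. For a store-like profile with no result, `Profile{0, 4}`, index 1 maps to "operand 1".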

def SDTSubVecExtract : SDTypeProfile<1, 2, [// subvector extract		def SDTSubVecExtract : SDTypeProfile<1, 2, [// subvector extract
SDTCisSubVecOfVec<0,1>, SDTCisInt<2>		SDTCisSubVecOfVec<0,1>, SDTCisInt<2>
]>;		]>;
def SDTSubVecInsert : SDTypeProfile<1, 3, [ // subvector insert		def SDTSubVecInsert : SDTypeProfile<1, 3, [ // subvector insert
SDTCisSubVecOfVec<2, 1>, SDTCisSameAs<0,1>, SDTCisInt<3>		SDTCisSubVecOfVec<2, 1>, SDTCisSameAs<0,1>, SDTCisInt<3>
]>;		]>;

[131 unchanged lines not shown]
def smin : SDNode<"ISD::SMIN" , SDTIntBinOp,		def smin : SDNode<"ISD::SMIN" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;		[SDNPCommutative, SDNPAssociative]>;
def smax : SDNode<"ISD::SMAX" , SDTIntBinOp,		def smax : SDNode<"ISD::SMAX" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;		[SDNPCommutative, SDNPAssociative]>;
def umin : SDNode<"ISD::UMIN" , SDTIntBinOp,		def umin : SDNode<"ISD::UMIN" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;		[SDNPCommutative, SDNPAssociative]>;
def umax : SDNode<"ISD::UMAX" , SDTIntBinOp,		def umax : SDNode<"ISD::UMAX" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;		[SDNPCommutative, SDNPAssociative]>;

		// TODO SDNPCommutative/SDNPAssociative for VP operators.
		def vp_and : SDNode<"ISD::VP_AND" , SDTIntBinOpVP>;
		def vp_or : SDNode<"ISD::VP_OR" , SDTIntBinOpVP>;
		def vp_xor : SDNode<"ISD::VP_XOR" , SDTIntBinOpVP>;
		def vp_srl : SDNode<"ISD::VP_SRL" , SDTIntShiftOpVP>;
		def vp_sra : SDNode<"ISD::VP_SRA" , SDTIntShiftOpVP>;
		def vp_shl : SDNode<"ISD::VP_SHL" , SDTIntShiftOpVP>;

		def vp_add : SDNode<"ISD::VP_ADD" , SDTIntBinOpVP>;
		def vp_sub : SDNode<"ISD::VP_SUB" , SDTIntBinOpVP>;
		def vp_mul : SDNode<"ISD::VP_MUL" , SDTIntBinOpVP>;
		def vp_sdiv : SDNode<"ISD::VP_SDIV" , SDTIntBinOpVP>;
		def vp_udiv : SDNode<"ISD::VP_UDIV" , SDTIntBinOpVP>;
		def vp_srem : SDNode<"ISD::VP_SREM" , SDTIntBinOpVP>;
		def vp_urem : SDNode<"ISD::VP_UREM" , SDTIntBinOpVP>;

def saddsat : SDNode<"ISD::SADDSAT" , SDTIntBinOp, [SDNPCommutative]>;		def saddsat : SDNode<"ISD::SADDSAT" , SDTIntBinOp, [SDNPCommutative]>;
def uaddsat : SDNode<"ISD::UADDSAT" , SDTIntBinOp, [SDNPCommutative]>;		def uaddsat : SDNode<"ISD::UADDSAT" , SDTIntBinOp, [SDNPCommutative]>;
def ssubsat : SDNode<"ISD::SSUBSAT" , SDTIntBinOp>;		def ssubsat : SDNode<"ISD::SSUBSAT" , SDTIntBinOp>;
def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;		def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;

def smulfix : SDNode<"ISD::SMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;		def smulfix : SDNode<"ISD::SMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
def smulfixsat : SDNode<"ISD::SMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;		def smulfixsat : SDNode<"ISD::SMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;
def umulfix : SDNode<"ISD::UMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;		def umulfix : SDNode<"ISD::UMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
[25 unchanged lines not shown]
def insertelt : SDNode<"ISD::INSERT_VECTOR_ELT", SDTVecInsert>;		def insertelt : SDNode<"ISD::INSERT_VECTOR_ELT", SDTVecInsert>;

def vecreduce_add : SDNode<"ISD::VECREDUCE_ADD", SDTVecReduce>;		def vecreduce_add : SDNode<"ISD::VECREDUCE_ADD", SDTVecReduce>;
def vecreduce_smax : SDNode<"ISD::VECREDUCE_SMAX", SDTVecReduce>;		def vecreduce_smax : SDNode<"ISD::VECREDUCE_SMAX", SDTVecReduce>;
def vecreduce_umax : SDNode<"ISD::VECREDUCE_UMAX", SDTVecReduce>;		def vecreduce_umax : SDNode<"ISD::VECREDUCE_UMAX", SDTVecReduce>;
def vecreduce_smin : SDNode<"ISD::VECREDUCE_SMIN", SDTVecReduce>;		def vecreduce_smin : SDNode<"ISD::VECREDUCE_SMIN", SDTVecReduce>;
def vecreduce_umin : SDNode<"ISD::VECREDUCE_UMIN", SDTVecReduce>;		def vecreduce_umin : SDNode<"ISD::VECREDUCE_UMIN", SDTVecReduce>;

		def vp_reduce_add : SDNode<"ISD::VP_REDUCE_ADD", SDTReduceVP>;
		def vp_reduce_smax : SDNode<"ISD::VP_REDUCE_SMAX", SDTReduceVP>;
		def vp_reduce_umax : SDNode<"ISD::VP_REDUCE_UMAX", SDTReduceVP>;
		def vp_reduce_smin : SDNode<"ISD::VP_REDUCE_SMIN", SDTReduceVP>;
		def vp_reduce_umin : SDNode<"ISD::VP_REDUCE_UMIN", SDTReduceVP>;
		def vp_reduce_and : SDNode<"ISD::VP_REDUCE_AND", SDTReduceVP>;
		def vp_reduce_or : SDNode<"ISD::VP_REDUCE_OR", SDTReduceVP>;
		def vp_reduce_xor : SDNode<"ISD::VP_REDUCE_XOR", SDTReduceVP>;

		def vp_reduce_fadd : SDNode<"ISD::VP_REDUCE_FADD", SDTReduceStartVP>;
		def vp_reduce_fmul : SDNode<"ISD::VP_REDUCE_FMUL", SDTReduceStartVP>;
		def vp_reduce_fmin : SDNode<"ISD::VP_REDUCE_FMIN", SDTReduceVP>;
		def vp_reduce_fmax : SDNode<"ISD::VP_REDUCE_FMAX", SDTReduceVP>;

def fadd : SDNode<"ISD::FADD" , SDTFPBinOp, [SDNPCommutative]>;		def fadd : SDNode<"ISD::FADD" , SDTFPBinOp, [SDNPCommutative]>;
def fsub : SDNode<"ISD::FSUB" , SDTFPBinOp>;		def fsub : SDNode<"ISD::FSUB" , SDTFPBinOp>;
def fmul : SDNode<"ISD::FMUL" , SDTFPBinOp, [SDNPCommutative]>;		def fmul : SDNode<"ISD::FMUL" , SDTFPBinOp, [SDNPCommutative]>;
def fdiv : SDNode<"ISD::FDIV" , SDTFPBinOp>;		def fdiv : SDNode<"ISD::FDIV" , SDTFPBinOp>;
def frem : SDNode<"ISD::FREM" , SDTFPBinOp>;		def frem : SDNode<"ISD::FREM" , SDTFPBinOp>;
def fma : SDNode<"ISD::FMA" , SDTFPTernaryOp>;		def fma : SDNode<"ISD::FMA" , SDTFPTernaryOp>;
def fmad : SDNode<"ISD::FMAD" , SDTFPTernaryOp>;		def fmad : SDNode<"ISD::FMAD" , SDTFPTernaryOp>;
def fabs : SDNode<"ISD::FABS" , SDTFPUnaryOp>;		def fabs : SDNode<"ISD::FABS" , SDTFPUnaryOp>;
[29 unchanged lines not shown]
def llround : SDNode<"ISD::LLROUND" , SDTFPToIntOp>;		def llround : SDNode<"ISD::LLROUND" , SDTFPToIntOp>;
def lrint : SDNode<"ISD::LRINT" , SDTFPToIntOp>;		def lrint : SDNode<"ISD::LRINT" , SDTFPToIntOp>;
def llrint : SDNode<"ISD::LLRINT" , SDTFPToIntOp>;		def llrint : SDNode<"ISD::LLRINT" , SDTFPToIntOp>;

def fpround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;		def fpround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;
def fpextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;		def fpextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;
def fcopysign : SDNode<"ISD::FCOPYSIGN" , SDTFPSignOp>;		def fcopysign : SDNode<"ISD::FCOPYSIGN" , SDTFPSignOp>;

		// vector predication

		def vp_fneg : SDNode<"ISD::VP_FNEG" , SDTFPUnOpVP>;
		def vp_fadd : SDNode<"ISD::VP_FADD" , SDTFPBinOpVP>;
		def vp_fsub : SDNode<"ISD::VP_FSUB" , SDTFPBinOpVP>;
		def vp_fmul : SDNode<"ISD::VP_FMUL" , SDTFPBinOpVP>;
		def vp_fdiv : SDNode<"ISD::VP_FDIV" , SDTFPBinOpVP>;
		def vp_frem : SDNode<"ISD::VP_FREM" , SDTFPBinOpVP>;
		def vp_fminnum : SDNode<"ISD::VP_FMINNUM" , SDTFPBinOpVP>;
		def vp_fmaxnum : SDNode<"ISD::VP_FMAXNUM" , SDTFPBinOpVP>;
		def vp_fma : SDNode<"ISD::VP_FMA" , SDTFPTernaryOpVP>;

def sint_to_fp : SDNode<"ISD::SINT_TO_FP" , SDTIntToFPOp>;		def sint_to_fp : SDNode<"ISD::SINT_TO_FP" , SDTIntToFPOp>;
def uint_to_fp : SDNode<"ISD::UINT_TO_FP" , SDTIntToFPOp>;		def uint_to_fp : SDNode<"ISD::UINT_TO_FP" , SDTIntToFPOp>;
def fp_to_sint : SDNode<"ISD::FP_TO_SINT" , SDTFPToIntOp>;		def fp_to_sint : SDNode<"ISD::FP_TO_SINT" , SDTFPToIntOp>;
def fp_to_uint : SDNode<"ISD::FP_TO_UINT" , SDTFPToIntOp>;		def fp_to_uint : SDNode<"ISD::FP_TO_UINT" , SDTFPToIntOp>;
def f16_to_fp : SDNode<"ISD::FP16_TO_FP" , SDTIntToFPOp>;		def f16_to_fp : SDNode<"ISD::FP16_TO_FP" , SDTIntToFPOp>;
def fp_to_f16 : SDNode<"ISD::FP_TO_FP16" , SDTFPToIntOp>;		def fp_to_f16 : SDNode<"ISD::FP_TO_FP16" , SDTFPToIntOp>;

		def vp_sint_to_fp : SDNode<"ISD::VP_SINT_TO_FP" , SDTIntToFPOpVP>;
		def vp_uint_to_fp : SDNode<"ISD::VP_UINT_TO_FP" , SDTIntToFPOpVP>;
		def vp_fp_to_sint : SDNode<"ISD::VP_FP_TO_SINT" , SDTFPToIntOpVP>;
		def vp_fp_to_uint : SDNode<"ISD::VP_FP_TO_UINT" , SDTFPToIntOpVP>;

def strict_fadd : SDNode<"ISD::STRICT_FADD",		def strict_fadd : SDNode<"ISD::STRICT_FADD",
SDTFPBinOp, [SDNPHasChain, SDNPCommutative]>;		SDTFPBinOp, [SDNPHasChain, SDNPCommutative]>;
def strict_fsub : SDNode<"ISD::STRICT_FSUB",		def strict_fsub : SDNode<"ISD::STRICT_FSUB",
SDTFPBinOp, [SDNPHasChain]>;		SDTFPBinOp, [SDNPHasChain]>;
def strict_fmul : SDNode<"ISD::STRICT_FMUL",		def strict_fmul : SDNode<"ISD::STRICT_FMUL",
SDTFPBinOp, [SDNPHasChain, SDNPCommutative]>;		SDTFPBinOp, [SDNPHasChain, SDNPCommutative]>;
def strict_fdiv : SDNode<"ISD::STRICT_FDIV",		def strict_fdiv : SDNode<"ISD::STRICT_FDIV",
SDTFPBinOp, [SDNPHasChain]>;		SDTFPBinOp, [SDNPHasChain]>;
[54 unchanged lines not shown]
def strict_fp_to_uint : SDNode<"ISD::STRICT_FP_TO_UINT",		def strict_fp_to_uint : SDNode<"ISD::STRICT_FP_TO_UINT",
SDTFPToIntOp, [SDNPHasChain]>;		SDTFPToIntOp, [SDNPHasChain]>;
def strict_sint_to_fp : SDNode<"ISD::STRICT_SINT_TO_FP",		def strict_sint_to_fp : SDNode<"ISD::STRICT_SINT_TO_FP",
SDTIntToFPOp, [SDNPHasChain]>;		SDTIntToFPOp, [SDNPHasChain]>;
def strict_uint_to_fp : SDNode<"ISD::STRICT_UINT_TO_FP",		def strict_uint_to_fp : SDNode<"ISD::STRICT_UINT_TO_FP",
SDTIntToFPOp, [SDNPHasChain]>;		SDTIntToFPOp, [SDNPHasChain]>;

def setcc : SDNode<"ISD::SETCC" , SDTSetCC>;		def setcc : SDNode<"ISD::SETCC" , SDTSetCC>;
		def vp_setcc : SDNode<"ISD::VP_SETCC" , SDTSetCCVP>;
def select : SDNode<"ISD::SELECT" , SDTSelect>;		def select : SDNode<"ISD::SELECT" , SDTSelect>;
def vselect : SDNode<"ISD::VSELECT" , SDTVSelect>;		def vselect : SDNode<"ISD::VSELECT" , SDTVSelect>;
		def vp_select : SDNode<"ISD::VP_SELECT" , SDTSelectVP>;
def selectcc : SDNode<"ISD::SELECT_CC" , SDTSelectCC>;		def selectcc : SDNode<"ISD::SELECT_CC" , SDTSelectCC>;

def brcc : SDNode<"ISD::BR_CC" , SDTBrCC, [SDNPHasChain]>;		def brcc : SDNode<"ISD::BR_CC" , SDTBrCC, [SDNPHasChain]>;
def brcond : SDNode<"ISD::BRCOND" , SDTBrcond, [SDNPHasChain]>;		def brcond : SDNode<"ISD::BRCOND" , SDTBrcond, [SDNPHasChain]>;
def brind : SDNode<"ISD::BRIND" , SDTBrind, [SDNPHasChain]>;		def brind : SDNode<"ISD::BRIND" , SDTBrind, [SDNPHasChain]>;
def br : SDNode<"ISD::BR" , SDTBr, [SDNPHasChain]>;		def br : SDNode<"ISD::BR" , SDTBr, [SDNPHasChain]>;
def catchret : SDNode<"ISD::CATCHRET" , SDTCatchret,		def catchret : SDNode<"ISD::CATCHRET" , SDTCatchret,
[SDNPHasChain, SDNPSideEffect]>;		[SDNPHasChain, SDNPSideEffect]>;
[50 unchanged lines not shown]
def atomic_store : SDNode<"ISD::ATOMIC_STORE", SDTAtomicStore,		def atomic_store : SDNode<"ISD::ATOMIC_STORE", SDTAtomicStore,
[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;

def masked_st : SDNode<"ISD::MSTORE", SDTMaskedStore,		def masked_st : SDNode<"ISD::MSTORE", SDTMaskedStore,
[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
def masked_ld : SDNode<"ISD::MLOAD", SDTMaskedLoad,		def masked_ld : SDNode<"ISD::MLOAD", SDTMaskedLoad,
[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;

		def vp_store : SDNode<"ISD::VP_STORE", SDTStoreVP,
		[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
		def vp_load : SDNode<"ISD::VP_LOAD", SDTLoadVP,
		[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;

		def vp_scatter : SDNode<"ISD::VP_SCATTER", SDTScatterVP,
		[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
		def vp_gather : SDNode<"ISD::VP_GATHER", SDTGatherVP,
		[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;

// Do not use ld, st directly. Use load, extload, sextload, zextload, store,		// Do not use ld, st directly. Use load, extload, sextload, zextload, store,
// and truncst (see below).		// and truncst (see below).
def ld : SDNode<"ISD::LOAD" , SDTLoad,		def ld : SDNode<"ISD::LOAD" , SDTLoad,
[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
def st : SDNode<"ISD::STORE" , SDTStore,		def st : SDNode<"ISD::STORE" , SDTStore,
[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
def ist : SDNode<"ISD::STORE" , SDTIStore,		def ist : SDNode<"ISD::STORE" , SDTIStore,
[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;		[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
		def vp_vshift : SDNode<"ISD::VP_VSHIFT", SDTVShiftVP, []>;
def vector_shuffle : SDNode<"ISD::VECTOR_SHUFFLE", SDTVecShuffle, []>;		def vector_shuffle : SDNode<"ISD::VECTOR_SHUFFLE", SDTVecShuffle, []>;
def build_vector : SDNode<"ISD::BUILD_VECTOR", SDTypeProfile<1, -1, []>, []>;		def build_vector : SDNode<"ISD::BUILD_VECTOR", SDTypeProfile<1, -1, []>, []>;
def scalar_to_vector : SDNode<"ISD::SCALAR_TO_VECTOR", SDTypeProfile<1, 1, []>,		def scalar_to_vector : SDNode<"ISD::SCALAR_TO_VECTOR", SDTypeProfile<1, 1, []>,
[]>;		[]>;

// vector_extract/vector_insert are deprecated. extractelt/insertelt		// vector_extract/vector_insert are deprecated. extractelt/insertelt
// are preferred.		// are preferred.
def vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",		def vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",

llvm/lib/Analysis/InstructionSimplify.cpp

#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/GetElementPtrTypeIterator.h"		#include "llvm/IR/GetElementPtrTypeIterator.h"
#include "llvm/IR/GlobalAlias.h"		#include "llvm/IR/GlobalAlias.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
		#include "llvm/IR/PredicatedInst.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
#include "llvm/Support/KnownBits.h"		#include "llvm/Support/KnownBits.h"
#include <algorithm>		#include <algorithm>
using namespace llvm;		using namespace llvm;
using namespace llvm::PatternMatch;		using namespace llvm::PatternMatch;

#define DEBUG_TYPE "instsimplify"		#define DEBUG_TYPE "instsimplify"

▲ Show 20 Lines • Show All 4,610 Lines • ▼ Show 20 Lines	if (FMF.noSignedZeros() && FMF.allowReassoc() &&
match(Op1, m_FSub(m_Value(X), m_Specific(Op0)))))		match(Op1, m_FSub(m_Value(X), m_Specific(Op0)))))
return X;		return X;

return nullptr;		return nullptr;
}		}

/// Given operands for an FSub, see if we can fold the result. If not, this		/// Given operands for an FSub, see if we can fold the result. If not, this
/// returns null.		/// returns null.
static Value *SimplifyFSubInst(Value *Op0, Value *Op1, FastMathFlags FMF,		template<typename MatchContext>
const SimplifyQuery &Q, unsigned MaxRecurse) {		static Value *SimplifyFSubInstGeneric(Value *Op0, Value *Op1, FastMathFlags FMF,
		const SimplifyQuery &Q, unsigned MaxRecurse, MatchContext & MC) {

if (Constant *C = foldOrCommuteConstant(Instruction::FSub, Op0, Op1, Q))		if (Constant *C = foldOrCommuteConstant(Instruction::FSub, Op0, Op1, Q))
return C;		return C;

if (Constant *C = simplifyFPOp({Op0, Op1}))		if (Constant *C = simplifyFPOp({Op0, Op1}))
return C;		return C;

// fsub X, +0 ==> X		// fsub X, +0 ==> X
if (match(Op1, m_PosZeroFP()))		if (MC.try_match(Op1, m_PosZeroFP()))
return Op0;		return Op0;

// fsub X, -0 ==> X, when we know X is not -0		// fsub X, -0 ==> X, when we know X is not -0
if (match(Op1, m_NegZeroFP()) &&		if (MC.try_match(Op1, m_NegZeroFP()) &&
(FMF.noSignedZeros() || CannotBeNegativeZero(Op0, Q.TLI)))		(FMF.noSignedZeros() || CannotBeNegativeZero(Op0, Q.TLI)))
return Op0;		return Op0;

// fsub -0.0, (fsub -0.0, X) ==> X		// fsub -0.0, (fsub -0.0, X) ==> X
// fsub -0.0, (fneg X) ==> X		// fsub -0.0, (fneg X) ==> X
Value *X;		Value *X;
if (match(Op0, m_NegZeroFP()) &&		if (MC.try_match(Op0, m_NegZeroFP()) &&
match(Op1, m_FNeg(m_Value(X))))		MC.try_match(Op1, m_FNeg(m_Value(X))))
return X;		return X;

// fsub 0.0, (fsub 0.0, X) ==> X if signed zeros are ignored.		// fsub 0.0, (fsub 0.0, X) ==> X if signed zeros are ignored.
// fsub 0.0, (fneg X) ==> X if signed zeros are ignored.		// fsub 0.0, (fneg X) ==> X if signed zeros are ignored.
if (FMF.noSignedZeros() && match(Op0, m_AnyZeroFP()) &&		if (FMF.noSignedZeros() && match(Op0, m_AnyZeroFP()) &&
(match(Op1, m_FSub(m_AnyZeroFP(), m_Value(X))) ||		(MC.try_match(Op1, m_FSub(m_AnyZeroFP(), m_Value(X))) ||
match(Op1, m_FNeg(m_Value(X)))))		MC.try_match(Op1, m_FNeg(m_Value(X)))))
return X;		return X;

// fsub nnan x, x ==> 0.0		// fsub nnan x, x ==> 0.0
if (FMF.noNaNs() && Op0 == Op1)		if (FMF.noNaNs() && Op0 == Op1)
return Constant::getNullValue(Op0->getType());		return Constant::getNullValue(Op0->getType());

// Y - (Y - X) --> X		// Y - (Y - X) --> X
// (X + Y) - Y --> X		// (X + Y) - Y --> X
if (FMF.noSignedZeros() && FMF.allowReassoc() &&		if (FMF.noSignedZeros() && FMF.allowReassoc() &&
(match(Op1, m_FSub(m_Specific(Op0), m_Value(X))) ||		(MC.try_match(Op1, m_FSub(m_Specific(Op0), m_Value(X))) ||
match(Op0, m_c_FAdd(m_Specific(Op1), m_Value(X)))))		MC.try_match(Op0, m_c_FAdd(m_Specific(Op1), m_Value(X)))))
return X;		return X;

return nullptr;		return nullptr;
}		}

static Value *SimplifyFMAFMul(Value *Op0, Value *Op1, FastMathFlags FMF,		static Value *SimplifyFMAFMul(Value *Op0, Value *Op1, FastMathFlags FMF,
const SimplifyQuery &Q, unsigned MaxRecurse) {		const SimplifyQuery &Q, unsigned MaxRecurse) {
if (Constant *C = simplifyFPOp({Op0, Op1}))		if (Constant *C = simplifyFPOp({Op0, Op1}))
Show All 37 Lines	static Value SimplifyFMulInst(Value Op0, Value *Op1, FastMathFlags FMF,
return SimplifyFMAFMul(Op0, Op1, FMF, Q, MaxRecurse);		return SimplifyFMAFMul(Op0, Op1, FMF, Q, MaxRecurse);
}		}

Value *llvm::SimplifyFAddInst(Value *Op0, Value *Op1, FastMathFlags FMF,		Value *llvm::SimplifyFAddInst(Value *Op0, Value *Op1, FastMathFlags FMF,
const SimplifyQuery &Q) {		const SimplifyQuery &Q) {
return ::SimplifyFAddInst(Op0, Op1, FMF, Q, RecursionLimit);		return ::SimplifyFAddInst(Op0, Op1, FMF, Q, RecursionLimit);
}		}



		/// Given operands for an FSub, see if we can fold the result.
		static Value *SimplifyFSubInst(Value *Op0, Value *Op1, FastMathFlags FMF,
		const SimplifyQuery &Q, unsigned MaxRecurse) {
		if (Constant *C = foldOrCommuteConstant(Instruction::FSub, Op0, Op1, Q))
		return C;

		EmptyContext EC;
		return SimplifyFSubInstGeneric<EmptyContext>(Op0, Op1, FMF, Q, RecursionLimit, EC);
		}

Value *llvm::SimplifyFSubInst(Value *Op0, Value *Op1, FastMathFlags FMF,		Value *llvm::SimplifyFSubInst(Value *Op0, Value *Op1, FastMathFlags FMF,
const SimplifyQuery &Q) {		const SimplifyQuery &Q) {
return ::SimplifyFSubInst(Op0, Op1, FMF, Q, RecursionLimit);		// Now apply simplifications that do not require rounding.
		return SimplifyFSubInst(Op0, Op1, FMF, Q, RecursionLimit);
		}

		Value *llvm::SimplifyPredicatedFSubInst(Value *Op0, Value *Op1, FastMathFlags FMF,
		const SimplifyQuery &Q, PredicatedContext & PC) {
		return ::SimplifyFSubInstGeneric<PredicatedContext>(Op0, Op1, FMF, Q, RecursionLimit, PC);
}		}
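
The change above writes the FSub folds once, as a template over a match context, and instantiates it for both plain and predicated instructions. The pattern can be sketched standalone, without LLVM. Everything named here (`ToyValue`, `EmptyCtx`, `MaskCtx`, `accepts`, `simplifySub`) is hypothetical, and the rule that a predicated context only accepts operands carrying the same mask is an assumption made for illustration, not the documented behavior of `PredicatedContext` in this patch.

```cpp
#include <cmath>

// Toy stand-in for an IR value (hypothetical, not an LLVM type).
enum class Kind { Const, Neg, Opaque };

struct ToyValue {
  Kind K = Kind::Opaque;
  double C = 0.0;               // payload when K == Kind::Const
  const ToyValue *Op = nullptr; // operand when K == Kind::Neg
  int MaskId = 0;               // 0 means "not predicated" (assumption)
};

// Context for ordinary instructions: only unpredicated operands match.
struct EmptyCtx {
  bool accepts(const ToyValue &V) const { return V.MaskId == 0; }
};

// Context for predicated instructions: operands must use the same mask
// as the instruction being simplified (assumed rule for this sketch).
struct MaskCtx {
  int MaskId;
  bool accepts(const ToyValue &V) const { return V.MaskId == MaskId; }
};

static bool isNegZero(const ToyValue &V) {
  return V.K == Kind::Const && V.C == 0.0 && std::signbit(V.C);
}

// fsub -0.0, (fneg X) ==> X, written once and shared by every context.
template <typename Ctx>
const ToyValue *simplifySub(const ToyValue &Op0, const ToyValue &Op1,
                            const Ctx &MC) {
  if (isNegZero(Op0) && Op1.K == Kind::Neg && MC.accepts(Op1))
    return Op1.Op;
  return nullptr; // no simplification found
}
```

Calling `simplifySub(NegZero, Neg, MaskCtx{7})` on a negation that carries mask 7 returns the negated operand, while `EmptyCtx{}` or a different mask yields `nullptr`. The fold itself never mentions predication; only the context decides whether an operand may participate.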

Value *llvm::SimplifyFMulInst(Value *Op0, Value *Op1, FastMathFlags FMF,		Value *llvm::SimplifyFMulInst(Value *Op0, Value *Op1, FastMathFlags FMF,
const SimplifyQuery &Q) {		const SimplifyQuery &Q) {
return ::SimplifyFMulInst(Op0, Op1, FMF, Q, RecursionLimit);		return ::SimplifyFMulInst(Op0, Op1, FMF, Q, RecursionLimit);
}		}

Value *llvm::SimplifyFMAFMul(Value *Op0, Value *Op1, FastMathFlags FMF,		Value *llvm::SimplifyFMAFMul(Value *Op0, Value *Op1, FastMathFlags FMF,
▲ Show 20 Lines • Show All 597 Lines • ▼ Show 20 Lines	static Value SimplifyFreezeInst(Value Op0) {
// We have room for improvement.		// We have room for improvement.
return nullptr;		return nullptr;
}		}

Value *llvm::SimplifyFreezeInst(Value *Op0, const SimplifyQuery &Q) {		Value *llvm::SimplifyFreezeInst(Value *Op0, const SimplifyQuery &Q) {
return ::SimplifyFreezeInst(Op0);		return ::SimplifyFreezeInst(Op0);
}		}

		Value *llvm::SimplifyVPIntrinsic(VPIntrinsic & VPInst, const SimplifyQuery &Q) {
		PredicatedContext PC(&VPInst);

		auto & PI = cast<PredicatedInstruction>(VPInst);
		switch (PI.getOpcode()) {
		default:
		return nullptr;

		case Instruction::FSub: return SimplifyPredicatedFSubInst(VPInst.getOperand(0), VPInst.getOperand(1), VPInst.getFastMathFlags(), Q, PC);
		}
		}

/// See if we can compute a simplified version of this instruction.		/// See if we can compute a simplified version of this instruction.
/// If not, this returns null.		/// If not, this returns null.

Value *llvm::SimplifyInstruction(Instruction *I, const SimplifyQuery &SQ,		Value *llvm::SimplifyInstruction(Instruction *I, const SimplifyQuery &SQ,
OptimizationRemarkEmitter *ORE) {		OptimizationRemarkEmitter *ORE) {
const SimplifyQuery Q = SQ.CxtI ? SQ : SQ.getWithInstruction(I);		const SimplifyQuery Q = SQ.CxtI ? SQ : SQ.getWithInstruction(I);
Value *Result;		Value *Result;

switch (I->getOpcode()) {		switch (I->getOpcode()) {
default:		default:
Result = ConstantFoldInstruction(I, Q.DL, Q.TLI);		Result = ConstantFoldInstruction(I, Q.DL, Q.TLI);
▲ Show 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	case Instruction::ShuffleVector: {
Result = SimplifyShuffleVectorInst(SVI->getOperand(0), SVI->getOperand(1),		Result = SimplifyShuffleVectorInst(SVI->getOperand(0), SVI->getOperand(1),
SVI->getMask(), SVI->getType(), Q);		SVI->getMask(), SVI->getType(), Q);
break;		break;
}		}
case Instruction::PHI:		case Instruction::PHI:
Result = SimplifyPHINode(cast<PHINode>(I), Q);		Result = SimplifyPHINode(cast<PHINode>(I), Q);
break;		break;
case Instruction::Call: {		case Instruction::Call: {
		auto * VPInst = dyn_cast<VPIntrinsic>(I);
		if (VPInst) {
		Result = SimplifyVPIntrinsic(*VPInst, Q);
		if (Result) break;
		}

		CallSite CS((I));
Result = SimplifyCall(cast<CallInst>(I), Q);		Result = SimplifyCall(cast<CallInst>(I), Q);
break;		break;
}		}
case Instruction::Freeze:		case Instruction::Freeze:
Result = SimplifyFreezeInst(I->getOperand(0), Q);		Result = SimplifyFreezeInst(I->getOperand(0), Q);
break;		break;
#define HANDLE_CAST_INST(num, opc, clas) case Instruction::opc:		#define HANDLE_CAST_INST(num, opc, clas) case Instruction::opc:
#include "llvm/IR/Instruction.def"		#include "llvm/IR/Instruction.def"

llvm/lib/Analysis/TargetTransformInfo.cpp


	unsigned TargetTransformInfo::getStoreVectorFactor(unsigned VF,			unsigned TargetTransformInfo::getStoreVectorFactor(unsigned VF,
	unsigned StoreSize,			unsigned StoreSize,
	unsigned ChainSizeInBytes,			unsigned ChainSizeInBytes,
	VectorType *VecTy) const {			VectorType *VecTy) const {
	return TTIImpl->getStoreVectorFactor(VF, StoreSize, ChainSizeInBytes, VecTy);			return TTIImpl->getStoreVectorFactor(VF, StoreSize, ChainSizeInBytes, VecTy);
	}			}

				bool TargetTransformInfo::shouldFoldVectorLengthIntoMask(
				const PredicatedInstruction &PI) const {
				return TTIImpl->shouldFoldVectorLengthIntoMask(PI);
				}

				bool TargetTransformInfo::supportsVPOperation(
				const PredicatedInstruction &PI) const {
				return TTIImpl->supportsVPOperation(PI);
				}

	bool TargetTransformInfo::useReductionIntrinsic(unsigned Opcode,			bool TargetTransformInfo::useReductionIntrinsic(unsigned Opcode,
	Type *Ty, ReductionFlags Flags) const {			Type *Ty, ReductionFlags Flags) const {
	return TTIImpl->useReductionIntrinsic(Opcode, Ty, Flags);			return TTIImpl->useReductionIntrinsic(Opcode, Ty, Flags);
	}			}

	bool TargetTransformInfo::shouldExpandReduction(const IntrinsicInst *II) const {			bool TargetTransformInfo::shouldExpandReduction(const IntrinsicInst *II) const {
	return TTIImpl->shouldExpandReduction(II);			return TTIImpl->shouldExpandReduction(II);
	}			}
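
The two new hooks follow the convention visible in the surrounding functions: the public `TargetTransformInfo` method only forwards to `TTIImpl`, and each target answers the query in its own implementation. A standalone sketch of that forwarding through a type-erased wrapper follows. `ToyTTI`, `ToyOp`, `BaselineImpl`, and `VectorTargetImpl` are hypothetical names, and the conservative `false` default is an assumption, since the default implementation is not part of this hunk.

```cpp
#include <memory>
#include <utility>

// Toy stand-in for a predicated instruction (hypothetical).
struct ToyOp {
  unsigned Opcode;
};

class ToyTTI {
  // Abstract interface the wrapper forwards to.
  struct Concept {
    virtual ~Concept() = default;
    virtual bool supportsVPOperation(const ToyOp &Op) const = 0;
  };

  // Adapts any implementation type T that provides the hook.
  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T I) : Impl(std::move(I)) {}
    bool supportsVPOperation(const ToyOp &Op) const override {
      return Impl.supportsVPOperation(Op);
    }
  };

  std::unique_ptr<Concept> TTIImpl;

public:
  template <typename T>
  explicit ToyTTI(T Impl)
      : TTIImpl(std::make_unique<Model<T>>(std::move(Impl))) {}

  // Public entry point: pure forwarding, as in the functions above.
  bool supportsVPOperation(const ToyOp &Op) const {
    return TTIImpl->supportsVPOperation(Op);
  }
};

// Conservative default (assumed): no predicated operation is supported.
struct BaselineImpl {
  bool supportsVPOperation(const ToyOp &) const { return false; }
};

// A target that claims native support for a single opcode.
struct VectorTargetImpl {
  bool supportsVPOperation(const ToyOp &Op) const { return Op.Opcode == 1; }
};
```

With this shape, a pass can ask `ToyTTI(VectorTargetImpl{}).supportsVPOperation(ToyOp{1})` without knowing which target is behind the wrapper, and a target that does not override the hook keeps the conservative answer.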

llvm/lib/AsmParser/LLLexer.cpp

Show First 20 Lines • Show All 640 Lines • ▼ Show 20 Lines	#define KEYWORD(STR) \
KEYWORD(convergent);		KEYWORD(convergent);
KEYWORD(dereferenceable);		KEYWORD(dereferenceable);
KEYWORD(dereferenceable_or_null);		KEYWORD(dereferenceable_or_null);
KEYWORD(inaccessiblememonly);		KEYWORD(inaccessiblememonly);
KEYWORD(inaccessiblemem_or_argmemonly);		KEYWORD(inaccessiblemem_or_argmemonly);
KEYWORD(inlinehint);		KEYWORD(inlinehint);
KEYWORD(inreg);		KEYWORD(inreg);
KEYWORD(jumptable);		KEYWORD(jumptable);
		KEYWORD(mask);
KEYWORD(minsize);		KEYWORD(minsize);
KEYWORD(naked);		KEYWORD(naked);
KEYWORD(nest);		KEYWORD(nest);
KEYWORD(noalias);		KEYWORD(noalias);
KEYWORD(nobuiltin);		KEYWORD(nobuiltin);
KEYWORD(nocapture);		KEYWORD(nocapture);
KEYWORD(noduplicate);		KEYWORD(noduplicate);
KEYWORD(nofree);		KEYWORD(nofree);
KEYWORD(noimplicitfloat);		KEYWORD(noimplicitfloat);
KEYWORD(noinline);		KEYWORD(noinline);
KEYWORD(norecurse);		KEYWORD(norecurse);
KEYWORD(nonlazybind);		KEYWORD(nonlazybind);
KEYWORD(nonnull);		KEYWORD(nonnull);
KEYWORD(noredzone);		KEYWORD(noredzone);
KEYWORD(noreturn);		KEYWORD(noreturn);
KEYWORD(nosync);		KEYWORD(nosync);
KEYWORD(nocf_check);		KEYWORD(nocf_check);
KEYWORD(nounwind);		KEYWORD(nounwind);
KEYWORD(optforfuzzing);		KEYWORD(optforfuzzing);
KEYWORD(optnone);		KEYWORD(optnone);
KEYWORD(optsize);		KEYWORD(optsize);
		KEYWORD(passthru);
KEYWORD(readnone);		KEYWORD(readnone);
KEYWORD(readonly);		KEYWORD(readonly);
KEYWORD(returned);		KEYWORD(returned);
KEYWORD(returns_twice);		KEYWORD(returns_twice);
KEYWORD(signext);		KEYWORD(signext);
KEYWORD(speculatable);		KEYWORD(speculatable);
KEYWORD(sret);		KEYWORD(sret);
KEYWORD(ssp);		KEYWORD(ssp);
KEYWORD(sspreq);		KEYWORD(sspreq);
KEYWORD(sspstrong);		KEYWORD(sspstrong);
KEYWORD(strictfp);		KEYWORD(strictfp);
KEYWORD(safestack);		KEYWORD(safestack);
KEYWORD(shadowcallstack);		KEYWORD(shadowcallstack);
KEYWORD(sanitize_address);		KEYWORD(sanitize_address);
KEYWORD(sanitize_hwaddress);		KEYWORD(sanitize_hwaddress);
KEYWORD(sanitize_memtag);		KEYWORD(sanitize_memtag);
KEYWORD(sanitize_thread);		KEYWORD(sanitize_thread);
KEYWORD(sanitize_memory);		KEYWORD(sanitize_memory);
KEYWORD(speculative_load_hardening);		KEYWORD(speculative_load_hardening);
KEYWORD(swifterror);		KEYWORD(swifterror);
KEYWORD(swiftself);		KEYWORD(swiftself);
KEYWORD(uwtable);		KEYWORD(uwtable);
KEYWORD(willreturn);		KEYWORD(willreturn);
		KEYWORD(vlen);
KEYWORD(writeonly);		KEYWORD(writeonly);
KEYWORD(zeroext);		KEYWORD(zeroext);
KEYWORD(immarg);		KEYWORD(immarg);

KEYWORD(type);		KEYWORD(type);
KEYWORD(opaque);		KEYWORD(opaque);

KEYWORD(comdat);		KEYWORD(comdat);

llvm/lib/AsmParser/LLParser.cpp


Show First 20 Lines • Show All 1,335 Lines • ▼ Show 20 Lines	case lltok::kw_zeroext:
HaveError |=		HaveError |=
Error(Lex.getLoc(),		Error(Lex.getLoc(),
"invalid use of attribute on a function");		"invalid use of attribute on a function");
break;		break;
case lltok::kw_byval:		case lltok::kw_byval:
case lltok::kw_dereferenceable:		case lltok::kw_dereferenceable:
case lltok::kw_dereferenceable_or_null:		case lltok::kw_dereferenceable_or_null:
case lltok::kw_inalloca:		case lltok::kw_inalloca:
		case lltok::kw_mask:
case lltok::kw_nest:		case lltok::kw_nest:
case lltok::kw_noalias:		case lltok::kw_noalias:
case lltok::kw_nocapture:		case lltok::kw_nocapture:
case lltok::kw_nonnull:		case lltok::kw_nonnull:
		case lltok::kw_passthru:
case lltok::kw_returned:		case lltok::kw_returned:
case lltok::kw_sret:		case lltok::kw_sret:
case lltok::kw_swifterror:		case lltok::kw_swifterror:
case lltok::kw_swiftself:		case lltok::kw_swiftself:
case lltok::kw_immarg:		case lltok::kw_immarg:
		case lltok::kw_vlen:
HaveError |=		HaveError |=
Error(Lex.getLoc(),		Error(Lex.getLoc(),
"invalid use of parameter-only attribute on a function");		"invalid use of parameter-only attribute on a function");
break;		break;
}		}

Lex.Lex();		Lex.Lex();
}		}
▲ Show 20 Lines • Show All 270 Lines • ▼ Show 20 Lines	case lltok::kw_dereferenceable_or_null: {
uint64_t Bytes;		uint64_t Bytes;
if (ParseOptionalDerefAttrBytes(lltok::kw_dereferenceable_or_null, Bytes))		if (ParseOptionalDerefAttrBytes(lltok::kw_dereferenceable_or_null, Bytes))
return true;		return true;
B.addDereferenceableOrNullAttr(Bytes);		B.addDereferenceableOrNullAttr(Bytes);
continue;		continue;
}		}
case lltok::kw_inalloca: B.addAttribute(Attribute::InAlloca); break;		case lltok::kw_inalloca: B.addAttribute(Attribute::InAlloca); break;
case lltok::kw_inreg: B.addAttribute(Attribute::InReg); break;		case lltok::kw_inreg: B.addAttribute(Attribute::InReg); break;
		case lltok::kw_mask: B.addAttribute(Attribute::Mask); break;
case lltok::kw_nest: B.addAttribute(Attribute::Nest); break;		case lltok::kw_nest: B.addAttribute(Attribute::Nest); break;
case lltok::kw_noalias: B.addAttribute(Attribute::NoAlias); break;		case lltok::kw_noalias: B.addAttribute(Attribute::NoAlias); break;
case lltok::kw_nocapture: B.addAttribute(Attribute::NoCapture); break;		case lltok::kw_nocapture: B.addAttribute(Attribute::NoCapture); break;
case lltok::kw_nofree: B.addAttribute(Attribute::NoFree); break;		case lltok::kw_nofree: B.addAttribute(Attribute::NoFree); break;
case lltok::kw_nonnull: B.addAttribute(Attribute::NonNull); break;		case lltok::kw_nonnull: B.addAttribute(Attribute::NonNull); break;
		case lltok::kw_passthru: B.addAttribute(Attribute::Passthru); break;
case lltok::kw_readnone: B.addAttribute(Attribute::ReadNone); break;		case lltok::kw_readnone: B.addAttribute(Attribute::ReadNone); break;
case lltok::kw_readonly: B.addAttribute(Attribute::ReadOnly); break;		case lltok::kw_readonly: B.addAttribute(Attribute::ReadOnly); break;
case lltok::kw_returned: B.addAttribute(Attribute::Returned); break;		case lltok::kw_returned: B.addAttribute(Attribute::Returned); break;
case lltok::kw_signext: B.addAttribute(Attribute::SExt); break;		case lltok::kw_signext: B.addAttribute(Attribute::SExt); break;
case lltok::kw_sret: B.addAttribute(Attribute::StructRet); break;		case lltok::kw_sret: B.addAttribute(Attribute::StructRet); break;
case lltok::kw_swifterror: B.addAttribute(Attribute::SwiftError); break;		case lltok::kw_swifterror: B.addAttribute(Attribute::SwiftError); break;
case lltok::kw_swiftself: B.addAttribute(Attribute::SwiftSelf); break;		case lltok::kw_swiftself: B.addAttribute(Attribute::SwiftSelf); break;
		case lltok::kw_vlen: B.addAttribute(Attribute::VectorLength); break;
case lltok::kw_writeonly: B.addAttribute(Attribute::WriteOnly); break;		case lltok::kw_writeonly: B.addAttribute(Attribute::WriteOnly); break;
case lltok::kw_zeroext: B.addAttribute(Attribute::ZExt); break;		case lltok::kw_zeroext: B.addAttribute(Attribute::ZExt); break;
case lltok::kw_immarg: B.addAttribute(Attribute::ImmArg); break;		case lltok::kw_immarg: B.addAttribute(Attribute::ImmArg); break;

case lltok::kw_alignstack:		case lltok::kw_alignstack:
case lltok::kw_alwaysinline:		case lltok::kw_alwaysinline:
case lltok::kw_argmemonly:		case lltok::kw_argmemonly:
case lltok::kw_builtin:		case lltok::kw_builtin:
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	while (true) {
case lltok::kw_noalias: B.addAttribute(Attribute::NoAlias); break;		case lltok::kw_noalias: B.addAttribute(Attribute::NoAlias); break;
case lltok::kw_nonnull: B.addAttribute(Attribute::NonNull); break;		case lltok::kw_nonnull: B.addAttribute(Attribute::NonNull); break;
case lltok::kw_signext: B.addAttribute(Attribute::SExt); break;		case lltok::kw_signext: B.addAttribute(Attribute::SExt); break;
case lltok::kw_zeroext: B.addAttribute(Attribute::ZExt); break;		case lltok::kw_zeroext: B.addAttribute(Attribute::ZExt); break;

// Error handling.		// Error handling.
case lltok::kw_byval:		case lltok::kw_byval:
case lltok::kw_inalloca:		case lltok::kw_inalloca:
		case lltok::kw_mask:
case lltok::kw_nest:		case lltok::kw_nest:
case lltok::kw_nocapture:		case lltok::kw_nocapture:
		case lltok::kw_passthru:
case lltok::kw_returned:		case lltok::kw_returned:
case lltok::kw_sret:		case lltok::kw_sret:
case lltok::kw_swifterror:		case lltok::kw_swifterror:
case lltok::kw_swiftself:		case lltok::kw_swiftself:
case lltok::kw_immarg:		case lltok::kw_immarg:
		case lltok::kw_vlen:
HaveError |= Error(Lex.getLoc(), "invalid use of parameter-only attribute");		HaveError |= Error(Lex.getLoc(), "invalid use of parameter-only attribute");
break;		break;

case lltok::kw_alignstack:		case lltok::kw_alignstack:
case lltok::kw_alwaysinline:		case lltok::kw_alwaysinline:
case lltok::kw_argmemonly:		case lltok::kw_argmemonly:
case lltok::kw_builtin:		case lltok::kw_builtin:
case lltok::kw_cold:		case lltok::kw_cold:

llvm/lib/AsmParser/LLToken.h

Show First 20 Lines • Show All 186 Lines • ▼ Show 20 Lines	enum Kind {
kw_convergent,		kw_convergent,
kw_dereferenceable,		kw_dereferenceable,
kw_dereferenceable_or_null,		kw_dereferenceable_or_null,
kw_inaccessiblememonly,		kw_inaccessiblememonly,
kw_inaccessiblemem_or_argmemonly,		kw_inaccessiblemem_or_argmemonly,
kw_inlinehint,		kw_inlinehint,
kw_inreg,		kw_inreg,
kw_jumptable,		kw_jumptable,
		kw_mask,
kw_minsize,		kw_minsize,
kw_naked,		kw_naked,
kw_nest,		kw_nest,
kw_noalias,		kw_noalias,
kw_nobuiltin,		kw_nobuiltin,
kw_nocapture,		kw_nocapture,
kw_noduplicate,		kw_noduplicate,
kw_nofree,		kw_nofree,
kw_noimplicitfloat,		kw_noimplicitfloat,
kw_noinline,		kw_noinline,
kw_norecurse,		kw_norecurse,
kw_nonlazybind,		kw_nonlazybind,
kw_nonnull,		kw_nonnull,
kw_noredzone,		kw_noredzone,
kw_noreturn,		kw_noreturn,
kw_nosync,		kw_nosync,
kw_nocf_check,		kw_nocf_check,
kw_nounwind,		kw_nounwind,
kw_optforfuzzing,		kw_optforfuzzing,
kw_optnone,		kw_optnone,
kw_optsize,		kw_optsize,
		kw_passthru,
kw_readnone,		kw_readnone,
kw_readonly,		kw_readonly,
kw_returned,		kw_returned,
kw_returns_twice,		kw_returns_twice,
kw_signext,		kw_signext,
kw_speculatable,		kw_speculatable,
kw_ssp,		kw_ssp,
kw_sspreq,		kw_sspreq,
kw_sspstrong,		kw_sspstrong,
kw_safestack,		kw_safestack,
kw_shadowcallstack,		kw_shadowcallstack,
kw_sret,		kw_sret,
kw_sanitize_thread,		kw_sanitize_thread,
kw_sanitize_memory,		kw_sanitize_memory,
kw_speculative_load_hardening,		kw_speculative_load_hardening,
kw_strictfp,		kw_strictfp,
kw_swifterror,		kw_swifterror,
kw_swiftself,		kw_swiftself,
kw_uwtable,		kw_uwtable,
kw_willreturn,		kw_willreturn,
		kw_vlen,
kw_writeonly,		kw_writeonly,
kw_zeroext,		kw_zeroext,
kw_immarg,		kw_immarg,

kw_type,		kw_type,
kw_opaque,		kw_opaque,

kw_comdat,		kw_comdat,

llvm/lib/Bitcode/Reader/BitcodeReader.cpp

Show First 20 Lines • Show All 1,296 Lines • ▼ Show 20 Lines	case Attribute::ArgMemOnly:
llvm_unreachable("argmemonly attribute not supported in raw format");		llvm_unreachable("argmemonly attribute not supported in raw format");
break;		break;
case Attribute::AllocSize:		case Attribute::AllocSize:
llvm_unreachable("allocsize not supported in raw format");		llvm_unreachable("allocsize not supported in raw format");
break;		break;
case Attribute::SanitizeMemTag:		case Attribute::SanitizeMemTag:
llvm_unreachable("sanitize_memtag attribute not supported in raw format");		llvm_unreachable("sanitize_memtag attribute not supported in raw format");
break;		break;
		case Attribute::Mask:
		llvm_unreachable("mask attribute not supported in raw format");
		break;
		case Attribute::VectorLength:
		llvm_unreachable("vlen attribute not supported in raw format");
		break;
		case Attribute::Passthru:
		llvm_unreachable("passthru attribute not supported in raw format");
		break;
}		}
llvm_unreachable("Unsupported attribute type");		llvm_unreachable("Unsupported attribute type");
}		}

static void addRawAttributeValue(AttrBuilder &B, uint64_t Val) {		static void addRawAttributeValue(AttrBuilder &B, uint64_t Val) {
if (!Val) return;		if (!Val) return;

for (Attribute::AttrKind I = Attribute::None; I != Attribute::EndAttrKinds;		for (Attribute::AttrKind I = Attribute::None; I != Attribute::EndAttrKinds;
I = Attribute::AttrKind(I + 1)) {		I = Attribute::AttrKind(I + 1)) {
if (I == Attribute::SanitizeMemTag ||		if (I == Attribute::SanitizeMemTag ||
I == Attribute::Dereferenceable ||		I == Attribute::Dereferenceable ||
I == Attribute::DereferenceableOrNull ||		I == Attribute::DereferenceableOrNull ||
I == Attribute::ArgMemOnly ||		I == Attribute::ArgMemOnly ||
I == Attribute::AllocSize ||		I == Attribute::AllocSize ||
		I == Attribute::Mask ||
		I == Attribute::VectorLength ||
		I == Attribute::Passthru ||
I == Attribute::NoSync)		I == Attribute::NoSync)
continue;		continue;
if (uint64_t A = (Val & getRawAttributeMask(I))) {		if (uint64_t A = (Val & getRawAttributeMask(I))) {
if (I == Attribute::Alignment)		if (I == Attribute::Alignment)
B.addAlignmentAttr(1ULL << ((A >> 16) - 1));		B.addAlignmentAttr(1ULL << ((A >> 16) - 1));
else if (I == Attribute::StackAlignment)		else if (I == Attribute::StackAlignment)
B.addStackAlignmentAttr(1ULL << ((A >> 26)-1));		B.addStackAlignmentAttr(1ULL << ((A >> 26)-1));
else		else
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
case bitc::ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEMONLY:		case bitc::ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEMONLY:
return Attribute::InaccessibleMemOrArgMemOnly;		return Attribute::InaccessibleMemOrArgMemOnly;
case bitc::ATTR_KIND_INLINE_HINT:		case bitc::ATTR_KIND_INLINE_HINT:
return Attribute::InlineHint;		return Attribute::InlineHint;
case bitc::ATTR_KIND_IN_REG:		case bitc::ATTR_KIND_IN_REG:
return Attribute::InReg;		return Attribute::InReg;
case bitc::ATTR_KIND_JUMP_TABLE:		case bitc::ATTR_KIND_JUMP_TABLE:
return Attribute::JumpTable;		return Attribute::JumpTable;
		case bitc::ATTR_KIND_MASK:
		return Attribute::Mask;
case bitc::ATTR_KIND_MIN_SIZE:		case bitc::ATTR_KIND_MIN_SIZE:
return Attribute::MinSize;		return Attribute::MinSize;
case bitc::ATTR_KIND_NAKED:		case bitc::ATTR_KIND_NAKED:
return Attribute::Naked;		return Attribute::Naked;
case bitc::ATTR_KIND_NEST:		case bitc::ATTR_KIND_NEST:
return Attribute::Nest;		return Attribute::Nest;
case bitc::ATTR_KIND_NO_ALIAS:		case bitc::ATTR_KIND_NO_ALIAS:
return Attribute::NoAlias;		return Attribute::NoAlias;
Show All 32 Lines	static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
case bitc::ATTR_KIND_NO_UNWIND:		case bitc::ATTR_KIND_NO_UNWIND:
return Attribute::NoUnwind;		return Attribute::NoUnwind;
case bitc::ATTR_KIND_OPT_FOR_FUZZING:		case bitc::ATTR_KIND_OPT_FOR_FUZZING:
return Attribute::OptForFuzzing;		return Attribute::OptForFuzzing;
case bitc::ATTR_KIND_OPTIMIZE_FOR_SIZE:		case bitc::ATTR_KIND_OPTIMIZE_FOR_SIZE:
return Attribute::OptimizeForSize;		return Attribute::OptimizeForSize;
case bitc::ATTR_KIND_OPTIMIZE_NONE:		case bitc::ATTR_KIND_OPTIMIZE_NONE:
return Attribute::OptimizeNone;		return Attribute::OptimizeNone;
		case bitc::ATTR_KIND_PASSTHRU:
		return Attribute::Passthru;
case bitc::ATTR_KIND_READ_NONE:		case bitc::ATTR_KIND_READ_NONE:
return Attribute::ReadNone;		return Attribute::ReadNone;
case bitc::ATTR_KIND_READ_ONLY:		case bitc::ATTR_KIND_READ_ONLY:
return Attribute::ReadOnly;		return Attribute::ReadOnly;
case bitc::ATTR_KIND_RETURNED:		case bitc::ATTR_KIND_RETURNED:
return Attribute::Returned;		return Attribute::Returned;
case bitc::ATTR_KIND_RETURNS_TWICE:		case bitc::ATTR_KIND_RETURNS_TWICE:
return Attribute::ReturnsTwice;		return Attribute::ReturnsTwice;
Show All 30 Lines	static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
case bitc::ATTR_KIND_SWIFT_ERROR:		case bitc::ATTR_KIND_SWIFT_ERROR:
return Attribute::SwiftError;		return Attribute::SwiftError;
case bitc::ATTR_KIND_SWIFT_SELF:		case bitc::ATTR_KIND_SWIFT_SELF:
return Attribute::SwiftSelf;		return Attribute::SwiftSelf;
case bitc::ATTR_KIND_UW_TABLE:		case bitc::ATTR_KIND_UW_TABLE:
return Attribute::UWTable;		return Attribute::UWTable;
case bitc::ATTR_KIND_WILLRETURN:		case bitc::ATTR_KIND_WILLRETURN:
return Attribute::WillReturn;		return Attribute::WillReturn;
		case bitc::ATTR_KIND_VECTORLENGTH:
		return Attribute::VectorLength;
case bitc::ATTR_KIND_WRITEONLY:		case bitc::ATTR_KIND_WRITEONLY:
return Attribute::WriteOnly;		return Attribute::WriteOnly;
case bitc::ATTR_KIND_Z_EXT:		case bitc::ATTR_KIND_Z_EXT:
return Attribute::ZExt;		return Attribute::ZExt;
case bitc::ATTR_KIND_IMMARG:		case bitc::ATTR_KIND_IMMARG:
return Attribute::ImmArg;		return Attribute::ImmArg;
case bitc::ATTR_KIND_SANITIZE_MEMTAG:		case bitc::ATTR_KIND_SANITIZE_MEMTAG:
return Attribute::SanitizeMemTag;		return Attribute::SanitizeMemTag;
▲ Show 20 Lines • Show All 5,176 Lines • Show Last 20 Lines

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

Show First 20 Lines • Show All 672 Lines • ▼ Show 20 Lines	static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
case Attribute::OptimizeNone:		case Attribute::OptimizeNone:
return bitc::ATTR_KIND_OPTIMIZE_NONE;		return bitc::ATTR_KIND_OPTIMIZE_NONE;
case Attribute::ReadNone:		case Attribute::ReadNone:
return bitc::ATTR_KIND_READ_NONE;		return bitc::ATTR_KIND_READ_NONE;
case Attribute::ReadOnly:		case Attribute::ReadOnly:
return bitc::ATTR_KIND_READ_ONLY;		return bitc::ATTR_KIND_READ_ONLY;
case Attribute::Returned:		case Attribute::Returned:
return bitc::ATTR_KIND_RETURNED;		return bitc::ATTR_KIND_RETURNED;
		case Attribute::Mask:
		return bitc::ATTR_KIND_MASK;
		case Attribute::VectorLength:
		return bitc::ATTR_KIND_VECTORLENGTH;
		case Attribute::Passthru:
		return bitc::ATTR_KIND_PASSTHRU;
case Attribute::ReturnsTwice:		case Attribute::ReturnsTwice:
return bitc::ATTR_KIND_RETURNS_TWICE;		return bitc::ATTR_KIND_RETURNS_TWICE;
case Attribute::SExt:		case Attribute::SExt:
return bitc::ATTR_KIND_S_EXT;		return bitc::ATTR_KIND_S_EXT;
case Attribute::Speculatable:		case Attribute::Speculatable:
return bitc::ATTR_KIND_SPECULATABLE;		return bitc::ATTR_KIND_SPECULATABLE;
case Attribute::StackAlignment:		case Attribute::StackAlignment:
return bitc::ATTR_KIND_STACK_ALIGNMENT;		return bitc::ATTR_KIND_STACK_ALIGNMENT;
▲ Show 20 Lines • Show All 4,106 Lines • Show Last 20 Lines

llvm/lib/CodeGen/CMakeLists.txt

Show All 19 Lines	add_llvm_component_library(LLVMCodeGen
DFAPacketizer.cpp		DFAPacketizer.cpp
DwarfEHPrepare.cpp		DwarfEHPrepare.cpp
EarlyIfConversion.cpp		EarlyIfConversion.cpp
EdgeBundles.cpp		EdgeBundles.cpp
ExecutionDomainFix.cpp		ExecutionDomainFix.cpp
ExpandMemCmp.cpp		ExpandMemCmp.cpp
ExpandPostRAPseudos.cpp		ExpandPostRAPseudos.cpp
ExpandReductions.cpp		ExpandReductions.cpp
		ExpandVectorPredication.cpp
FaultMaps.cpp		FaultMaps.cpp
FEntryInserter.cpp		FEntryInserter.cpp
FinalizeISel.cpp		FinalizeISel.cpp
FuncletLayout.cpp		FuncletLayout.cpp
GCMetadata.cpp		GCMetadata.cpp
GCMetadataPrinter.cpp		GCMetadataPrinter.cpp
GCRootLowering.cpp		GCRootLowering.cpp
GCStrategy.cpp		GCStrategy.cpp
▲ Show 20 Lines • Show All 155 Lines • Show Last 20 Lines

llvm/lib/CodeGen/ExpandVectorPredication.cpp

This file was added.

				//===--- CodeGen/ExpandVectorPredication.cpp - Expand VP intrinsics -===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass implements IR expansion for vector predication intrinsics, allowing
				// targets to enable vector predication until just before codegen.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/ExpandVectorPredication.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/Constants.h"
				#include "llvm/IR/Function.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/Intrinsics.h"
				#include "llvm/IR/Module.h"
				#include "llvm/IR/PredicatedInst.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Pass.h"
				#include "llvm/Support/Debug.h"

				using namespace llvm;

				#define DEBUG_TYPE "expand-vec-pred"

				STATISTIC(NumFoldedVL, "Number of folded vector length params");
				STATISTIC(numLoweredVPOps, "Number of lowered vector predication operations");

				/// \returns Whether the vector mask \p MaskVal has all lane bits set.
				static bool IsAllTrueMask(Value *MaskVal) {
				auto ConstVec = dyn_cast<ConstantVector>(MaskVal);
				if (!ConstVec)
				return false;
				return ConstVec->isAllOnesValue();
				}

				/// \returns The constant \p ConstVal broadcasted to \p VecTy.
				static Value *BroadcastConstant(Constant *ConstVal, VectorType *VecTy) {
				return ConstantDataVector::getSplat(VecTy->getVectorNumElements(), ConstVal);
				}

				/// \returns The neutral element of the reduction \p VPRedID.
				static Value *GetNeutralElementVector(Intrinsic::ID VPRedID,
				VectorType *VecTy) {
				unsigned ElemBits = VecTy->getScalarSizeInBits();

				switch (VPRedID) {
				default:
				abort(); // invalid vp reduction intrinsic

				case Intrinsic::vp_reduce_add:
				case Intrinsic::vp_reduce_or:
				case Intrinsic::vp_reduce_xor:
				case Intrinsic::vp_reduce_umax:
				return Constant::getNullValue(VecTy);

				case Intrinsic::vp_reduce_mul:
				return BroadcastConstant(
				ConstantInt::get(VecTy->getElementType(), 1, false), VecTy);

				case Intrinsic::vp_reduce_and:
				case Intrinsic::vp_reduce_umin:
				return Constant::getAllOnesValue(VecTy);

				case Intrinsic::vp_reduce_smin:
				return BroadcastConstant(
				ConstantInt::get(VecTy->getContext(),
				APInt::getSignedMaxValue(ElemBits)),
				VecTy);
				case Intrinsic::vp_reduce_smax:
				return BroadcastConstant(
				ConstantInt::get(VecTy->getContext(),
				APInt::getSignedMinValue(ElemBits)),
				VecTy);

				case Intrinsic::vp_reduce_fmin:
				case Intrinsic::vp_reduce_fmax:
				return BroadcastConstant(ConstantFP::getQNaN(VecTy->getElementType()),
				VecTy);
				case Intrinsic::vp_reduce_fadd:
				return BroadcastConstant(ConstantFP::get(VecTy->getElementType(), 0.0),
				VecTy);
				case Intrinsic::vp_reduce_fmul:
				return BroadcastConstant(ConstantFP::get(VecTy->getElementType(), 1.0),
				VecTy);
				}
				}
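
Inline note on the neutral-element table above: masked-off lanes are replaced with the operation's neutral element so that an unpredicated reduction over all lanes gives the same result as reducing only the active lanes. The following is a minimal scalar sketch of that idea in plain C++, not LLVM code. `RedKind`, `neutralElement`, `applyOp`, and `maskedReduce` are hypothetical names introduced only for illustration, and only a few integer cases are modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical model of a few integer reduction kinds (not an LLVM type).
enum class RedKind { Add, Mul, And, SMin, SMax };

// Neutral element e of the operation: op(x, e) == x for every x.
int32_t neutralElement(RedKind K) {
  switch (K) {
  case RedKind::Add:  return 0;
  case RedKind::Mul:  return 1;
  case RedKind::And:  return -1; // all bits set
  case RedKind::SMin: return std::numeric_limits<int32_t>::max();
  case RedKind::SMax: return std::numeric_limits<int32_t>::min();
  }
  return 0; // not reached for valid kinds
}

int32_t applyOp(RedKind K, int32_t A, int32_t B) {
  switch (K) {
  case RedKind::Add: // wrap around instead of signed overflow
    return static_cast<int32_t>(static_cast<uint32_t>(A) +
                                static_cast<uint32_t>(B));
  case RedKind::Mul:
    return static_cast<int32_t>(static_cast<uint32_t>(A) *
                                static_cast<uint32_t>(B));
  case RedKind::And:  return A & B;
  case RedKind::SMin: return std::min(A, B);
  case RedKind::SMax: return std::max(A, B);
  }
  return A; // not reached for valid kinds
}

// Blend the neutral element into masked-off lanes, then reduce every lane.
int32_t maskedReduce(RedKind K, const std::vector<int32_t> &Vec,
                     const std::vector<bool> &Mask) {
  assert(Vec.size() == Mask.size() && "mask and data must have equal width");
  int32_t Acc = neutralElement(K);
  for (std::size_t Lane = 0; Lane < Vec.size(); ++Lane) {
    int32_t Blended = Mask[Lane] ? Vec[Lane] : neutralElement(K); // "select"
    Acc = applyOp(K, Acc, Blended);
  }
  return Acc;
}
```

For example, reducing `{1, 2, 3, 4}` with `RedKind::Add` under the mask `{true, false, true, false}` only sums lanes 0 and 2.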

				namespace {

				/// \brief The logical vector element size of this operation.
				int32_t GetFunctionalVectorElementSize() {
				return 64; // TODO infer from operation (eg
				// VPIntrinsic::getVectorElementSize())
				}

				/// \returns A vector with ascending integer indices (<0, 1, ..., NumElems-1>).
				Value *CreateStepVector(IRBuilder<> &Builder, int32_t ElemBits,
				int32_t NumElems) {
				// TODO add caching
				SmallVector<Constant *, 16> ConstElems;

				Type *LaneTy = Builder.getIntNTy(ElemBits);

				for (int32_t Idx = 0; Idx < NumElems; ++Idx) {
				ConstElems.push_back(ConstantInt::get(LaneTy, Idx, false));
				}

				return ConstantVector::get(ConstElems);
				}

				/// \returns A bitmask that is true where the lane position is less than
				/// \p VLParam.
				///
				/// \p Builder
				/// Used for instruction creation.
				/// \p VLParam
				/// The explicit vector length parameter to test against the lane
				/// positions.
				/// \p ElemBits
				/// Integer bitsize used for the generated ICmp instruction.
				/// \p NumElems
				/// Static vector length of the operation.
				Value *ConvertVLToMask(IRBuilder<> &Builder, Value *VLParam, int32_t ElemBits,
				int32_t NumElems) {
				// TODO increase elem bits to shrink wrap VLParam where necessary (eg if
				// operating on i8)
				Type *LaneTy = Builder.getIntNTy(ElemBits);

				auto ExtVLParam = Builder.CreateSExt(VLParam, LaneTy);
				auto VLSplat = Builder.CreateVectorSplat(NumElems, ExtVLParam);

				auto IdxVec = CreateStepVector(Builder, ElemBits, NumElems);

				return Builder.CreateICmp(CmpInst::ICMP_ULT, IdxVec, VLSplat);
				}
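
Inline note on `ConvertVLToMask`: the generated IR compares a step vector `<0, 1, ..., NumElems-1>` against a splat of the vector length with an unsigned less-than. A minimal scalar sketch of the resulting mask, in plain C++ rather than IRBuilder calls, could look as follows. The function name `convertVLToMask` and the use of `std::vector<bool>` are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Lane i of the result is true iff i < VL, which is what an unsigned
// compare of the step vector against splat(VL) yields per lane.
std::vector<bool> convertVLToMask(uint32_t VL, uint32_t NumElems) {
  assert(NumElems > 0 && "expected a non-empty fixed-width vector");
  std::vector<bool> Mask(NumElems);
  for (uint32_t Lane = 0; Lane < NumElems; ++Lane)
    Mask[Lane] = Lane < VL;
  return Mask;
}
```

A vector length of 0 gives an all-false mask, and any length at or above `NumElems` gives an all-true mask.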

				/// \returns A non-excepting divisor constant for this type.
				Constant *GetSafeDivisor(Type *DivTy) {
				if (DivTy->isIntOrIntVectorTy()) {
				return Constant::getAllOnesValue(DivTy);
				}
				if (DivTy->isFPOrFPVectorTy()) {
				return ConstantVector::getSplat(
				DivTy->getVectorNumElements(),
				ConstantFP::get(DivTy->getVectorElementType(), 1.0));
				}
				llvm_unreachable("Not a valid type for division");
				}

				/// Transfer operation properties from \p OldVPI to \p NewVal.
				void TransferDecorations(Value *NewVal, VPIntrinsic *OldVPI) {
				auto NewInst = dyn_cast<Instruction>(NewVal);
				if (!NewInst \|\| !isa<FPMathOperator>(NewVal))
				return;

				auto OldFMOp = dyn_cast<FPMathOperator>(OldVPI);
				if (!OldFMOp)
				return;

				NewInst->setFastMathFlags(OldFMOp->getFastMathFlags());
				}

				/// Transfer all properties from \p OldOp to \p NewOp and replace all uses.
				/// \p OldOp is erased afterwards.
				void ReplaceOperation(Value *NewOp, VPIntrinsic *OldOp) {
				TransferDecorations(NewOp, OldOp);
				OldOp->replaceAllUsesWith(NewOp);
				OldOp->eraseFromParent();
				}

				/// \brief Lower this vector-predicated operator into standard IR.
				void LowerVPUnaryOperator(VPIntrinsic *VPI) {
				assert(VPI->canIgnoreVectorLengthParam());
				auto OC = VPI->getFunctionalOpcode();
				auto FirstOp = VPI->getOperand(0);
				assert(OC == Instruction::FNeg);
				auto I = cast<Instruction>(VPI);
				IRBuilder<> Builder(I);
				auto NewFNeg = Builder.CreateFNegFMF(FirstOp, I, I->getName());
				ReplaceOperation(NewFNeg, VPI);
				}

				/// \brief Lower this VP binary operator to a non-VP binary operator.
				void LowerVPBinaryOperator(VPIntrinsic *VPI) {
				assert(VPI->canIgnoreVectorLengthParam());
				assert(VPI->isBinaryOp());

				auto OldBinOp = cast<Instruction>(VPI);

				auto FirstOp = VPI->getOperand(0);
				auto SndOp = VPI->getOperand(1);

				IRBuilder<> Builder(OldBinOp);
				auto Mask = VPI->getMaskParam();

				// Blend in safe operands
				if (!IsAllTrueMask(Mask)) {
				switch (VPI->getFunctionalOpcode()) {
				default:
				// can safely ignore the predicate
				break;

				// Division operators need a safe divisor on masked-off lanes (all-ones
				// for integer types, 1.0 for floating-point types).
				case Instruction::FDiv:
				case Instruction::FRem:
				case Instruction::UDiv:
				case Instruction::SDiv:
				case Instruction::URem:
				case Instruction::SRem:
				// 2nd operand must not be zero
				auto SafeDivisor = GetSafeDivisor(VPI->getType());
				SndOp = Builder.CreateSelect(Mask, SndOp, SafeDivisor);
				}
				}

				auto NewBinOp = Builder.CreateBinOp(
				static_cast<Instruction::BinaryOps>(VPI->getFunctionalOpcode()), FirstOp,
				SndOp, VPI->getName(), nullptr);

				ReplaceOperation(NewBinOp, VPI);
				}
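
Inline note on the safe-divisor blend in `LowerVPBinaryOperator`: once the predicate is dropped, every lane is divided, so masked-off lanes must not see a divisor that traps. A minimal scalar sketch in plain C++ follows. `maskedSDiv` is a hypothetical name, and the sketch deliberately uses 1 as the safe divisor, whereas `GetSafeDivisor` in the patch returns the all-ones constant for integer types. That substitution is an assumption of this note: all-ones is -1 under signed interpretation, and dividing the minimum signed value by -1 overflows, so it may be worth checking whether that case matters for `SDiv` and `SRem`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Divide all lanes after replacing the divisor on masked-off lanes with a
// value that cannot trap. Results on masked-off lanes carry no meaning.
std::vector<int32_t> maskedSDiv(const std::vector<int32_t> &Num,
                                const std::vector<int32_t> &Den,
                                const std::vector<bool> &Mask) {
  assert(Num.size() == Den.size() && Num.size() == Mask.size());
  const int32_t SafeDivisor = 1; // assumption of this sketch, see note above
  std::vector<int32_t> Result(Num.size());
  for (std::size_t Lane = 0; Lane < Num.size(); ++Lane) {
    int32_t Divisor = Mask[Lane] ? Den[Lane] : SafeDivisor; // "select"
    Result[Lane] = Num[Lane] / Divisor; // unpredicated division
  }
  return Result;
}
```

In the usage below, lane 1 has a zero divisor but is masked off, so the unpredicated division never divides by zero.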

				/// \brief Lower this vector-predicated cast operator.
				void LowerVPCastOperator(VPIntrinsic *VPI) {
				assert(VPI->canIgnoreVectorLengthParam());
				assert(!VPI->isConstrainedOp());
				auto OC = VPI->getFunctionalOpcode();
				IRBuilder<> Builder(cast<Instruction>(VPI));
				auto NewCast =
				Builder.CreateCast(static_cast<Instruction::CastOps>(OC),
				VPI->getArgOperand(0), VPI->getType(), VPI->getName());

				ReplaceOperation(NewCast, VPI);
				}

				/// \brief Lower llvm.vp.compose.* into a select instruction
				void LowerVPCompose(VPIntrinsic *VPI) {
				auto ElemBits = GetFunctionalVectorElementSize();
				ElementCount ElemCount = VPI->getVectorLength();
				assert(!ElemCount.Scalable && "TODO scalable type support");

				IRBuilder<> Builder(cast<Instruction>(VPI));
				auto PivotMask =
				ConvertVLToMask(Builder, VPI->getOperand(2), ElemBits, ElemCount.Min);
				auto NewCompose = Builder.CreateSelect(PivotMask, VPI->getOperand(0),
				VPI->getOperand(1), VPI->getName());

				ReplaceOperation(NewCompose, VPI);
				}

				/// \brief Lower this llvm.vp.fma intrinsic to a llvm.fma intrinsic.
				void LowerToIntrinsic(VPIntrinsic *VPI) {
				assert(VPI->canIgnoreVectorLengthParam());

				auto I = cast<Instruction>(VPI);
				auto M = I->getParent()->getModule();
				IRBuilder<> Builder(I);
				Intrinsic::ID IID = VPI->getFunctionalIntrinsicID();
				assert(IID != Intrinsic::not_intrinsic && "cannot lower to non-VP intrinsic");
				assert(!VPI->isConstrainedOp() &&
				"TODO implement lowering to constrained fp");
				assert(!VPIntrinsic::IsVPIntrinsic(IID));

				SmallVector<Type *, 2> IntrinTypeVec;
				IntrinTypeVec.push_back(VPI->getType()); // TODO simplify

				// Implicitly assumes that the return type is sufficient for disambiguation.
				Function *IntrinFunc = Intrinsic::getDeclaration(M, IID, IntrinTypeVec);
				assert(IntrinFunc);

				LLVM_DEBUG(dbgs() << "Using " << *IntrinFunc << " to lower "
				<< *VPI->getCalledFunction() << "\n");

				// Construct argument vector.
				assert(!IntrinFunc->getFunctionType()->isVarArg());
				unsigned NumIntrinParams = IntrinFunc->getFunctionType()->getNumParams();
				SmallVector<Value *, 4> IntrinArgs;
				for (unsigned i = 0; i < NumIntrinParams; ++i) {
				IntrinArgs.push_back(VPI->getArgOperand(i));
				}

				auto NewIntrin = Builder.CreateCall(IntrinFunc, IntrinArgs, VPI->getName());

				ReplaceOperation(NewIntrin, VPI);
				}

				/// \brief Lower this llvm.vp.reduce.* intrinsic to a llvm.experimental.reduce.*
				/// intrinsic.
				void LowerVPReduction(VPIntrinsic *VPI) {
				assert(VPI->canIgnoreVectorLengthParam());
				assert(VPI->isReductionOp());

				auto &I = *cast<Instruction>(VPI);
				IRBuilder<> Builder(&I);
				auto M = Builder.GetInsertBlock()->getModule();
				assert(M && "No module to declare reduction intrinsic in!");

				SmallVector<Value *, 3> Args;

				Value *RedVectorParam = VPI->getReductionVectorParam();
				Value *RedAccuParam = VPI->getReductionAccuParam();
				Value *MaskParam = VPI->getMaskParam();
				auto FunctionalID = VPI->getFunctionalIntrinsicID();

				// Insert neutral element in masked-out positions
				bool IsUnmasked = IsAllTrueMask(VPI->getMaskParam());
				if (!IsUnmasked) {
				auto *NeutralVector = GetNeutralElementVector(
				VPI->getIntrinsicID(), cast<VectorType>(RedVectorParam->getType()));
				RedVectorParam =
				Builder.CreateSelect(MaskParam, RedVectorParam, NeutralVector);
				}

				auto VecTypeArg = RedVectorParam->getType();

				Value *NewReduct;
				switch (FunctionalID) {
				default: {
				auto RedIntrinFunc = Intrinsic::getDeclaration(M, FunctionalID, VecTypeArg);
				NewReduct = Builder.CreateCall(RedIntrinFunc, RedVectorParam, I.getName());
				assert(!RedAccuParam && "accu dropped");
				} break;

				case Intrinsic::experimental_vector_reduce_v2_fadd:
				case Intrinsic::experimental_vector_reduce_v2_fmul: {
				auto TypeArg = RedAccuParam->getType();
				auto RedIntrinFunc =
				Intrinsic::getDeclaration(M, FunctionalID, {TypeArg, VecTypeArg});
				NewReduct = Builder.CreateCall(RedIntrinFunc,
				{RedAccuParam, RedVectorParam}, I.getName());
				} break;
				}

				TransferDecorations(NewReduct, VPI);
				I.replaceAllUsesWith(NewReduct);
				I.eraseFromParent();
				}

				/// \brief Lower this llvm.vp.(load\|store\|gather\|scatter) to a non-vp
				/// instruction.
				void LowerVPMemoryIntrinsic(VPIntrinsic *VPI) {
				assert(VPI->canIgnoreVectorLengthParam());
				auto &I = cast<Instruction>(*VPI);

				auto MaskParam = VPI->getMaskParam();
				auto PtrParam = VPI->getMemoryPointerParam();
				auto DataParam = VPI->getMemoryDataParam();
				bool IsUnmasked = IsAllTrueMask(MaskParam);

				IRBuilder<> Builder(&I);
				MaybeAlign AlignOpt = VPI->getPointerAlignment();

				Value *NewMemoryInst = nullptr;
				switch (VPI->getIntrinsicID()) {
				default:
				abort(); // not a VP memory intrinsic

				case Intrinsic::vp_store: {
				if (IsUnmasked) {
				StoreInst *NewStore = Builder.CreateStore(DataParam, PtrParam, false);
				if (AlignOpt.hasValue())
				NewStore->setAlignment(AlignOpt.getValue());
				NewMemoryInst = NewStore;
				} else {
				NewMemoryInst = Builder.CreateMaskedStore(
				DataParam, PtrParam, AlignOpt.valueOrOne(), MaskParam);
				}
				} break;

				case Intrinsic::vp_load: {
				if (IsUnmasked) {
				LoadInst *NewLoad = Builder.CreateLoad(PtrParam, false);
				if (AlignOpt.hasValue())
				NewLoad->setAlignment(AlignOpt.getValue());
				NewMemoryInst = NewLoad;
				} else {
				NewMemoryInst =
				Builder.CreateMaskedLoad(PtrParam, AlignOpt.valueOrOne(), MaskParam);
				}
				} break;

				case Intrinsic::vp_scatter: {
				// if (IsUnmasked) {
				// StoreInst *NewStore = Builder.CreateStore(DataParam, PtrParam, false);
				// if (AlignOpt.hasValue()) NewStore->setAlignment(AlignOpt.getValue());
				// NewMemoryInst = NewStore;
				// } else {
				Align MayAlign; // FIXME = PtrParam->getPointerAlignment(DL).valueOrOne();
				NewMemoryInst = Builder.CreateMaskedScatter(DataParam, PtrParam,
				MayAlign, MaskParam);
				// }
				} break;

				case Intrinsic::vp_gather: {
				// if (IsUnmasked) {
				// LoadInst *NewLoad = Builder.CreateLoad(I.getType(), PtrParam, false);
				// if (AlignOpt.hasValue()) NewLoad->setAlignment(AlignOpt.getValue());
				// NewMemoryInst = NewLoad;
				// } else {
				Align MayAlign; // FIXME = PtrParam->getPointerAlignment(DL).valueOrOne();
				NewMemoryInst = Builder.CreateMaskedGather(PtrParam, MayAlign.value(),
				MaskParam, nullptr, I.getName());
				// }
				} break;
				}

				assert(NewMemoryInst);
				ReplaceOperation(NewMemoryInst, VPI);
				}

				/// \brief Lower llvm.vp.select.* to a select instruction.
				void LowerVPSelectInst(VPIntrinsic *VPI) {
				auto I = cast<Instruction>(VPI);

				auto NewSelect = SelectInst::Create(VPI->getMaskParam(), VPI->getOperand(1),
				VPI->getOperand(2), I->getName(), I, I);
				ReplaceOperation(NewSelect, VPI);
				}

				/// \brief Lower llvm.vp.(icmp\|fcmp) to an icmp or fcmp instruction.
				void LowerVPCompare(VPIntrinsic *VPI) {
				auto NewCmp = CmpInst::Create(
				static_cast<Instruction::OtherOps>(VPI->getFunctionalOpcode()),
				VPI->getCmpPredicate(), VPI->getOperand(0), VPI->getOperand(1),
				VPI->getName(), cast<Instruction>(VPI));
				ReplaceOperation(NewCmp, VPI);
				}

				/// \brief Try to lower this vp_vshift operation.
				bool TryLowerVShift(VPIntrinsic *VPI) {
				// vshift(vec, amount, mask, vlen)

				// cannot lower dynamic shift amount
				auto *SrcVal = VPI->getArgOperand(0);
				auto *AmountVal = VPI->getArgOperand(1);
				if (!isa<ConstantInt>(AmountVal))
				return false;
				int64_t Amount = cast<ConstantInt>(AmountVal)->getSExtValue();

				// cannot lower scalable vector size
				auto ElemCount = VPI->getType()->getVectorElementCount();
				if (ElemCount.Scalable)
				return false;
				int VecWidth = ElemCount.Min;

				auto IntTy = Type::getInt32Ty(VPI->getContext());

				// constitute shuffle mask.
				std::vector<Constant *> Elems;
				for (int i = 0; i < (int)ElemCount.Min; ++i) {
				int64_t SrcLane = i - Amount;
				if (SrcLane < 0 \|\| SrcLane >= VecWidth)
				Elems.push_back(UndefValue::get(IntTy));
				else
				Elems.push_back(ConstantInt::get(IntTy, SrcLane));
				}
				auto *ShuffleMask = ConstantVector::get(Elems);

				auto *V2 = UndefValue::get(SrcVal->getType());

				// Translate to a shuffle
				auto NewI = new ShuffleVectorInst(SrcVal, V2, ShuffleMask, VPI->getName(),
				cast<Instruction>(VPI));
				ReplaceOperation(NewI, VPI);
				return true;
				}
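
Inline note on `TryLowerVShift`: for a constant shift amount, result lane `i` reads source lane `i - Amount`, and lanes whose source falls outside the vector become undefined. A minimal sketch of just the shuffle-mask computation in plain C++ follows. `vshiftShuffleMask` is a hypothetical helper name, and encoding an undefined lane as -1 is a convention chosen for this sketch rather than something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns, for each result lane, the source lane to read from, or -1 when
// the source lane would be out of range (an undefined lane).
std::vector<int> vshiftShuffleMask(int VecWidth, int64_t Amount) {
  assert(VecWidth > 0 && "expected a fixed, non-empty vector width");
  std::vector<int> Mask;
  Mask.reserve(VecWidth);
  for (int Lane = 0; Lane < VecWidth; ++Lane) {
    int64_t SrcLane = Lane - Amount;
    if (SrcLane < 0 || SrcLane >= VecWidth)
      Mask.push_back(-1); // undefined lane
    else
      Mask.push_back(static_cast<int>(SrcLane));
  }
  return Mask;
}
```

A positive amount moves elements towards higher lanes, a negative amount towards lower lanes.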

				/// \brief Lower a llvm.vp.* intrinsic that is not functionally equivalent to a
				/// standard IR instruction.
				void LowerUnmatchedVPIntrinsic(VPIntrinsic *VPI) {
				if (VPI->isReductionOp())
				return LowerVPReduction(VPI);

				switch (VPI->getIntrinsicID()) {
				default:
				LowerToIntrinsic(VPI);
				break;

				// Shuffles
				case Intrinsic::vp_compress:
				case Intrinsic::vp_expand:
				case Intrinsic::vp_vshift:
				if (TryLowerVShift(VPI))
				return;

				LLVM_DEBUG(dbgs() << "Silently keeping VP intrinsic: can not substitute: "
				<< *VPI << "\n");
				return;

				case Intrinsic::vp_compose:
				LowerVPCompose(VPI);
				break;

				case Intrinsic::vp_gather:
				case Intrinsic::vp_scatter:
				LowerVPMemoryIntrinsic(VPI);
				break;
				}
				}

				/// \brief Expand llvm.vp.* intrinsics as requested by \p TTI.
				bool expandVectorPredication(Function &F, const TargetTransformInfo *TTI) {
				bool Changed = false;

				// Holds all vector-predicated ops with an effective vector length param that
				// needs to be folded into the mask param.
				SmallVector<VPIntrinsic *, 4> ExpandVLWorklist;

				// Holds all vector-predicated ops that need to be translated into non-VP ops.
				SmallVector<VPIntrinsic *, 4> ExpandOpWorklist;

				for (auto &I : instructions(F)) {
				auto *VPI = dyn_cast<VPIntrinsic>(&I);
				if (!VPI)
				continue;

				auto &PI = cast<PredicatedInstruction>(*VPI);

				bool supportsVPOp = TTI->supportsVPOperation(PI);
				bool hasEffectiveVLParam = !VPI->canIgnoreVectorLengthParam();
				bool shouldFoldVLParam =
				!supportsVPOp \|\| TTI->shouldFoldVectorLengthIntoMask(PI);

				LLVM_DEBUG(dbgs() << "Inspecting " << *VPI
				<< "\n:: target-support=" << supportsVPOp
				<< ", effectiveVecLen=" << hasEffectiveVLParam
				<< ", shouldFoldVecLen=" << shouldFoldVLParam << "\n");

				if (shouldFoldVLParam) {
				if (hasEffectiveVLParam && VPI->getMaskParam()) {
				ExpandVLWorklist.push_back(VPI);
				} else {
				ExpandOpWorklist.push_back(VPI);
				}
				}
				}

				// Fold vector-length params into the mask param.
				LLVM_DEBUG(dbgs() << "\n:::: Folding vlen into mask. ::::\n");
				for (VPIntrinsic *VPI : ExpandVLWorklist) {
				++NumFoldedVL;
				Changed = true;

				LLVM_DEBUG(dbgs() << "Folding vlen for op: " << *VPI << '\n');

				IRBuilder<> Builder(cast<Instruction>(VPI));

				Value *OldMaskParam = VPI->getMaskParam();
				Value *OldVLParam = VPI->getVectorLengthParam();
				assert(OldMaskParam && "no mask param to fold the vl param into");
				assert(OldVLParam && "no vector length param to fold away");

				LLVM_DEBUG(dbgs() << "OLD vlen: " << *OldVLParam << '\n');
				LLVM_DEBUG(dbgs() << "OLD mask: " << *OldMaskParam << '\n');

				// Determine the lane bit size that should be used to lower this op
				auto ElemBits = GetFunctionalVectorElementSize();
				ElementCount ElemCount = VPI->getVectorLength();
				assert(!ElemCount.Scalable && "TODO scalable vector support");

				// Lower VL to M
				auto *VLMask =
				ConvertVLToMask(Builder, OldVLParam, ElemBits, ElemCount.Min);
				auto NewMaskParam = Builder.CreateAnd(VLMask, OldMaskParam);
				VPI->setMaskParam(
				NewMaskParam); // FIXME cannot trivially use the PI abstraction here.

				// Disable VL
				auto FullVL = Builder.getInt32(ElemCount.Min);
				VPI->setVectorLengthParam(FullVL);
				assert(VPI->canIgnoreVectorLengthParam() &&
				"transformation did not render the vl param ineffective!");

				LLVM_DEBUG(dbgs() << "NEW vlen: " << *FullVL << '\n');
				LLVM_DEBUG(dbgs() << "NEW mask: " << *NewMaskParam << '\n');

				auto &PI = cast<PredicatedInstruction>(*VPI);
				if (!TTI->supportsVPOperation(PI)) {
				ExpandOpWorklist.push_back(VPI);
				}
				}

				// Translate into non-VP ops
				LLVM_DEBUG(dbgs() << "\n:::: Lowering VP into non-VP ops ::::\n");
				for (VPIntrinsic *VPI : ExpandOpWorklist) {
				++numLoweredVPOps;
				Changed = true;

				LLVM_DEBUG(dbgs() << "Lowering vp op: " << *VPI << '\n');

				// Try lowering to an LLVM instruction first.
				unsigned OC = VPI->getFunctionalOpcode();
				#define FIRST_UNARY_INST(X) unsigned FirstUnOp = X;
				#define LAST_UNARY_INST(X) unsigned LastUnOp = X;
				#define FIRST_BINARY_INST(X) unsigned FirstBinOp = X;
				#define LAST_BINARY_INST(X) unsigned LastBinOp = X;
				#define FIRST_CAST_INST(X) unsigned FirstCastOp = X;
				#define LAST_CAST_INST(X) unsigned LastCastOp = X;
				#include "llvm/IR/Instruction.def"

				if (FirstBinOp <= OC && OC <= LastBinOp) {
				LowerVPBinaryOperator(VPI);
				continue;
				}
				if (FirstUnOp <= OC && OC <= LastUnOp) {
				LowerVPUnaryOperator(VPI);
				continue;
				}
				if (FirstCastOp <= OC && OC <= LastCastOp) {
				LowerVPCastOperator(VPI);
				continue;
				}

				// Lower to a non-VP intrinsic.
				switch (OC) {
				default:
				abort(); // unexpected intrinsic

				case Instruction::Call:
				LowerUnmatchedVPIntrinsic(VPI);
				break;

				case Instruction::Select:
				LowerVPSelectInst(VPI);
				break;

				case Instruction::Store:
				case Instruction::Load:
				LowerVPMemoryIntrinsic(VPI);
				break;

				case Instruction::ICmp:
				case Instruction::FCmp:
				LowerVPCompare(VPI);
				break;
				}
				}

				return Changed;
				}
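
Inline note on the first worklist in `expandVectorPredication`: folding the vector length means AND-ing the lane-index mask into the existing mask and then setting the vector length to the full width, after which the length parameter no longer restricts anything. A minimal scalar sketch in plain C++ follows. `VPOp`, `canIgnoreVL`, and `foldVLIntoMask` are hypothetical names, and treating "vector length is at least the lane count" as the condition for ignoring the length is an assumption about what `canIgnoreVectorLengthParam()` checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a predicated operation: a per-lane mask plus an
// explicit vector length. Lanes at index >= VL are inactive.
struct VPOp {
  std::vector<bool> Mask;
  uint32_t VL;
};

// Assumed meaning: the length covers every lane, so only the mask matters.
bool canIgnoreVL(const VPOp &Op) { return Op.VL >= Op.Mask.size(); }

// Fold the vector length into the mask and disable the length parameter.
void foldVLIntoMask(VPOp &Op) {
  for (std::size_t Lane = 0; Lane < Op.Mask.size(); ++Lane)
    Op.Mask[Lane] = Op.Mask[Lane] && Lane < Op.VL; // VLMask & OldMask
  Op.VL = static_cast<uint32_t>(Op.Mask.size());   // full width
  assert(canIgnoreVL(Op) && "folding did not make the length ineffective");
}
```

After folding, an operation can be lowered by code that looks only at the mask, which is what the second worklist relies on.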

				class ExpandVectorPredication : public FunctionPass {
				public:
				static char ID;
				ExpandVectorPredication() : FunctionPass(ID) {
				initializeExpandVectorPredicationPass(*PassRegistry::getPassRegistry());
				}

				bool runOnFunction(Function &F) override {
				const auto *TTI = &getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
				return expandVectorPredication(F, TTI);
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<TargetTransformInfoWrapperPass>();
				AU.setPreservesCFG();
				}
				};
				} // namespace

				char ExpandVectorPredication::ID;
				INITIALIZE_PASS_BEGIN(ExpandVectorPredication, "expand-vec-pred",
				"Expand vector predication intrinsics", false, false)
				INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
				INITIALIZE_PASS_END(ExpandVectorPredication, "expand-vec-pred",
				"Expand vector predication intrinsics", false, false)

				FunctionPass *llvm::createExpandVectorPredicationPass() {
				return new ExpandVectorPredication();
				}

				PreservedAnalyses
				ExpandVectorPredicationPass::run(Function &F, FunctionAnalysisManager &AM) {
				const auto &TTI = AM.getResult<TargetIRAnalysis>(F);
				if (!expandVectorPredication(F, &TTI))
				return PreservedAnalyses::all();
				PreservedAnalyses PA;
				PA.preserveSet<CFGAnalyses>();
				return PA;
				}

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Show First 20 Lines • Show All 431 Lines • ▼ Show 20 Lines	private:
SDValue visitAssertExt(SDNode *N);		SDValue visitAssertExt(SDNode *N);
SDValue visitSIGN_EXTEND_INREG(SDNode *N);		SDValue visitSIGN_EXTEND_INREG(SDNode *N);
SDValue visitSIGN_EXTEND_VECTOR_INREG(SDNode *N);		SDValue visitSIGN_EXTEND_VECTOR_INREG(SDNode *N);
SDValue visitZERO_EXTEND_VECTOR_INREG(SDNode *N);		SDValue visitZERO_EXTEND_VECTOR_INREG(SDNode *N);
SDValue visitTRUNCATE(SDNode *N);		SDValue visitTRUNCATE(SDNode *N);
SDValue visitBITCAST(SDNode *N);		SDValue visitBITCAST(SDNode *N);
SDValue visitBUILD_PAIR(SDNode *N);		SDValue visitBUILD_PAIR(SDNode *N);
SDValue visitFADD(SDNode *N);		SDValue visitFADD(SDNode *N);
		SDValue visitFADD_VP(SDNode *N);
SDValue visitFSUB(SDNode *N);		SDValue visitFSUB(SDNode *N);
SDValue visitFMUL(SDNode *N);		SDValue visitFMUL(SDNode *N);
SDValue visitFMA(SDNode *N);		SDValue visitFMA(SDNode *N);
SDValue visitFDIV(SDNode *N);		SDValue visitFDIV(SDNode *N);
SDValue visitFREM(SDNode *N);		SDValue visitFREM(SDNode *N);
SDValue visitFSQRT(SDNode *N);		SDValue visitFSQRT(SDNode *N);
SDValue visitFCOPYSIGN(SDNode *N);		SDValue visitFCOPYSIGN(SDNode *N);
SDValue visitFPOW(SDNode *N);		SDValue visitFPOW(SDNode *N);
Show All 32 Lines	private:
SDValue visitMLOAD(SDNode *N);		SDValue visitMLOAD(SDNode *N);
SDValue visitMSTORE(SDNode *N);		SDValue visitMSTORE(SDNode *N);
SDValue visitMGATHER(SDNode *N);		SDValue visitMGATHER(SDNode *N);
SDValue visitMSCATTER(SDNode *N);		SDValue visitMSCATTER(SDNode *N);
SDValue visitFP_TO_FP16(SDNode *N);		SDValue visitFP_TO_FP16(SDNode *N);
SDValue visitFP16_TO_FP(SDNode *N);		SDValue visitFP16_TO_FP(SDNode *N);
SDValue visitVECREDUCE(SDNode *N);		SDValue visitVECREDUCE(SDNode *N);

		template<class MatchContextClass>
SDValue visitFADDForFMACombine(SDNode *N);		SDValue visitFADDForFMACombine(SDNode *N);
SDValue visitFSUBForFMACombine(SDNode *N);		SDValue visitFSUBForFMACombine(SDNode *N);
SDValue visitFMULForFMADistributiveCombine(SDNode *N);		SDValue visitFMULForFMADistributiveCombine(SDNode *N);

SDValue XformToShuffleWithZero(SDNode *N);		SDValue XformToShuffleWithZero(SDNode *N);
bool reassociationCanBreakAddressingModePattern(unsigned Opc,		bool reassociationCanBreakAddressingModePattern(unsigned Opc,
const SDLoc &DL, SDValue N0,		const SDLoc &DL, SDValue N0,
SDValue N1);		SDValue N1);
▲ Show 20 Lines • Show All 246 Lines • ▼ Show 20 Lines	public:
explicit WorklistInserter(DAGCombiner &dc)		explicit WorklistInserter(DAGCombiner &dc)
: SelectionDAG::DAGUpdateListener(dc.getDAG()), DC(dc) {}		: SelectionDAG::DAGUpdateListener(dc.getDAG()), DC(dc) {}

// FIXME: Ideally we could add N to the worklist, but this causes exponential		// FIXME: Ideally we could add N to the worklist, but this causes exponential
// compile time costs in large DAGs, e.g. Halide.		// compile time costs in large DAGs, e.g. Halide.
void NodeInserted(SDNode *N) override { DC.ConsiderForPruning(N); }		void NodeInserted(SDNode *N) override { DC.ConsiderForPruning(N); }
};		};

		struct EmptyMatchContext {
		SelectionDAG & DAG;

		EmptyMatchContext(SelectionDAG & DAG, SDNode * Root)
		: DAG(DAG)
		{}

		bool match(SDValue OpN, unsigned OpCode) const { return OpCode == OpN->getOpcode(); }

		unsigned getFunctionOpCode(SDValue N) const {
		return N->getOpcode();
		}

		bool isCompatible(SDValue OpVal) const { return true; }

		// Specialize based on number of operands.
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT) { return DAG.getNode(Opcode, DL, VT); }
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue Operand,
		const SDNodeFlags Flags = SDNodeFlags()) {
		return DAG.getNode(Opcode, DL, VT, Operand, Flags);
		}
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1,
		SDValue N2, const SDNodeFlags Flags = SDNodeFlags()) {
		return DAG.getNode(Opcode, DL, VT, N1, N2, Flags);
		}
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1,
		SDValue N2, SDValue N3,
		const SDNodeFlags Flags = SDNodeFlags()) {
		return DAG.getNode(Opcode, DL, VT, N1, N2, N3, Flags);
		}

		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1,
		SDValue N2, SDValue N3, SDValue N4) {
		return DAG.getNode(Opcode, DL, VT, N1, N2, N3, N4);
		}

		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1,
		SDValue N2, SDValue N3, SDValue N4, SDValue N5) {
		return DAG.getNode(Opcode, DL, VT, N1, N2, N3, N4, N5);
		}
		};

		struct
		VPMatchContext {
		SelectionDAG & DAG;
		SDNode * Root;
		SDValue RootMaskOp;
		SDValue RootVectorLenOp;

		VPMatchContext(SelectionDAG & DAG, SDNode * Root)
		: DAG(DAG)
		, Root(Root)
		, RootMaskOp()
		, RootVectorLenOp()
		{
		if (Root->isVP()) {
		int RootMaskPos = ISD::GetMaskPosVP(Root->getOpcode());
		if (RootMaskPos != -1) {
		RootMaskOp = Root->getOperand(RootMaskPos);
		}

		int RootVLenPos = ISD::GetVectorLengthPosVP(Root->getOpcode());
		if (RootVLenPos != -1) {
		RootVectorLenOp = Root->getOperand(RootVLenPos);
		}
		}
		}

		unsigned getFunctionOpCode(SDValue N) const {
		unsigned VPOpCode = N->getOpcode();
		return ISD::GetFunctionOpCodeForVP(VPOpCode, !N->getFlags().hasNoFPExcept());
		}

		bool isCompatible(SDValue OpVal) const {
		if (!OpVal->isVP()) {
		return !Root->isVP();

		} else {
		unsigned VPOpCode = OpVal->getOpcode();
		int MaskPos = ISD::GetMaskPosVP(VPOpCode);
		if (MaskPos != -1 && RootMaskOp != OpVal.getOperand(MaskPos)) {
		return false;
		}

		int VLenPos = ISD::GetVectorLengthPosVP(VPOpCode);
		if (VLenPos != -1 && RootVectorLenOp != OpVal.getOperand(VLenPos)) {
		return false;
		}

		return true;
		}
		}

		/// whether \p OpN is a node that is functionally compatible with the NodeType \p OpNodeTy
		bool match(SDValue OpVal, unsigned OpNT) const {
		return isCompatible(OpVal) && getFunctionOpCode(OpVal) == OpNT;
		}

		// Specialize based on number of operands.
		// TODO emit VP intrinsics where MaskOp/VectorLenOp != null
		// SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT) { return DAG.getNode(Opcode, DL, VT); }
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue Operand,
		const SDNodeFlags Flags = SDNodeFlags()) {
		unsigned VPOpcode = ISD::GetVPForFunctionOpCode(Opcode);
		int MaskPos = ISD::GetMaskPosVP(VPOpcode);
		int VLenPos = ISD::GetVectorLengthPosVP(VPOpcode);
		assert(MaskPos == 1 && VLenPos == 2);

		return DAG.getNode(VPOpcode, DL, VT, {Operand, RootMaskOp, RootVectorLenOp}, Flags);
		}
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1,
		SDValue N2, const SDNodeFlags Flags = SDNodeFlags()) {
		unsigned VPOpcode = ISD::GetVPForFunctionOpCode(Opcode);
		int MaskPos = ISD::GetMaskPosVP(VPOpcode);
		int VLenPos = ISD::GetVectorLengthPosVP(VPOpcode);
		assert(MaskPos == 2 && VLenPos == 3);

		return DAG.getNode(VPOpcode, DL, VT, {N1, N2, RootMaskOp, RootVectorLenOp}, Flags);
		}
		SDValue getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1,
		SDValue N2, SDValue N3,
		const SDNodeFlags Flags = SDNodeFlags()) {
		unsigned VPOpcode = ISD::GetVPForFunctionOpCode(Opcode);
		int MaskPos = ISD::GetMaskPosVP(VPOpcode);
		int VLenPos = ISD::GetVectorLengthPosVP(VPOpcode);
		assert(MaskPos == 3 && VLenPos == 4);

		return DAG.getNode(VPOpcode, DL, VT, {N1, N2, N3, RootMaskOp, RootVectorLenOp}, Flags);
		}
		};

} // end anonymous namespace		} // end anonymous namespace

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// TargetLowering::DAGCombinerInfo implementation		// TargetLowering::DAGCombinerInfo implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

void TargetLowering::DAGCombinerInfo::AddToWorklist(SDNode *N) {		void TargetLowering::DAGCombinerInfo::AddToWorklist(SDNode *N) {
((DAGCombiner*)DC)->AddToWorklist(N);		((DAGCombiner*)DC)->AddToWorklist(N);
▲ Show 20 Lines • Show All 819 Lines • ▼ Show 20 Lines	SDValue DAGCombiner::visit(SDNode *N) {
case ISD::AssertZext: return visitAssertExt(N);		case ISD::AssertZext: return visitAssertExt(N);
case ISD::SIGN_EXTEND_INREG: return visitSIGN_EXTEND_INREG(N);		case ISD::SIGN_EXTEND_INREG: return visitSIGN_EXTEND_INREG(N);
case ISD::SIGN_EXTEND_VECTOR_INREG: return visitSIGN_EXTEND_VECTOR_INREG(N);		case ISD::SIGN_EXTEND_VECTOR_INREG: return visitSIGN_EXTEND_VECTOR_INREG(N);
case ISD::ZERO_EXTEND_VECTOR_INREG: return visitZERO_EXTEND_VECTOR_INREG(N);		case ISD::ZERO_EXTEND_VECTOR_INREG: return visitZERO_EXTEND_VECTOR_INREG(N);
case ISD::TRUNCATE: return visitTRUNCATE(N);		case ISD::TRUNCATE: return visitTRUNCATE(N);
case ISD::BITCAST: return visitBITCAST(N);		case ISD::BITCAST: return visitBITCAST(N);
case ISD::BUILD_PAIR: return visitBUILD_PAIR(N);		case ISD::BUILD_PAIR: return visitBUILD_PAIR(N);
case ISD::FADD: return visitFADD(N);		case ISD::FADD: return visitFADD(N);
		case ISD::VP_FADD: return visitFADD_VP(N);
case ISD::FSUB: return visitFSUB(N);		case ISD::FSUB: return visitFSUB(N);
case ISD::FMUL: return visitFMUL(N);		case ISD::FMUL: return visitFMUL(N);
case ISD::FMA: return visitFMA(N);		case ISD::FMA: return visitFMA(N);
case ISD::FDIV: return visitFDIV(N);		case ISD::FDIV: return visitFDIV(N);
case ISD::FREM: return visitFREM(N);		case ISD::FREM: return visitFREM(N);
case ISD::FSQRT: return visitFSQRT(N);		case ISD::FSQRT: return visitFSQRT(N);
case ISD::FCOPYSIGN: return visitFCOPYSIGN(N);		case ISD::FCOPYSIGN: return visitFCOPYSIGN(N);
case ISD::FPOW: return visitFPOW(N);		case ISD::FPOW: return visitFPOW(N);
▲ Show 20 Lines • Show All 9,994 Lines • ▼ Show 20 Lines	ConstantFoldBITCASTofBUILD_VECTOR(SDNode *BV, EVT DstEltVT) {
return DAG.getBuildVector(VT, DL, Ops);		return DAG.getBuildVector(VT, DL, Ops);
}		}

static bool isContractable(SDNode *N) {		static bool isContractable(SDNode *N) {
SDNodeFlags F = N->getFlags();		SDNodeFlags F = N->getFlags();
return F.hasAllowContract() || F.hasAllowReassociation();		return F.hasAllowContract() || F.hasAllowReassociation();
}		}


/// Try to perform FMA combining on a given FADD node.		/// Try to perform FMA combining on a given FADD node.
		template<class MatchContextClass>
SDValue DAGCombiner::visitFADDForFMACombine(SDNode *N) {		SDValue DAGCombiner::visitFADDForFMACombine(SDNode *N) {
SDValue N0 = N->getOperand(0);		SDValue N0 = N->getOperand(0);
SDValue N1 = N->getOperand(1);		SDValue N1 = N->getOperand(1);
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
SDLoc SL(N);		SDLoc SL(N);

		MatchContextClass matcher(DAG, N);
		if (!matcher.isCompatible(N0) || !matcher.isCompatible(N1)) return SDValue();

const TargetOptions &Options = DAG.getTarget().Options;		const TargetOptions &Options = DAG.getTarget().Options;

// Floating-point multiply-add with intermediate rounding.		// Floating-point multiply-add with intermediate rounding.
bool HasFMAD = (LegalOperations && TLI.isFMADLegalForFAddFSub(DAG, N));		bool HasFMAD = (LegalOperations && TLI.isFMADLegalForFAddFSub(DAG, N));

// Floating-point multiply-add without intermediate rounding.		// Floating-point multiply-add without intermediate rounding.
bool HasFMA =		bool HasFMA =
TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT) &&		TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT) &&
Show All 16 Lines	if (STI && STI->generateFMAsInMachineCombiner(OptLevel))
return SDValue();		return SDValue();

// Always prefer FMAD to FMA for precision.		// Always prefer FMAD to FMA for precision.
unsigned PreferredFusedOpcode = HasFMAD ? ISD::FMAD : ISD::FMA;		unsigned PreferredFusedOpcode = HasFMAD ? ISD::FMAD : ISD::FMA;
bool Aggressive = TLI.enableAggressiveFMAFusion(VT);		bool Aggressive = TLI.enableAggressiveFMAFusion(VT);

// Is the node an FMUL and contractable either due to global flags or		// Is the node an FMUL and contractable either due to global flags or
// SDNodeFlags.		// SDNodeFlags.
auto isContractableFMUL = [AllowFusionGlobally](SDValue N) {		auto isContractableFMUL = [AllowFusionGlobally, &matcher](SDValue N) {
if (N.getOpcode() != ISD::FMUL)		if (!matcher.match(N, ISD::FMUL))
return false;		return false;
return AllowFusionGlobally || isContractable(N.getNode());		return AllowFusionGlobally || isContractable(N.getNode());
};		};
// If we have two choices trying to fold (fadd (fmul u, v), (fmul x, y)),		// If we have two choices trying to fold (fadd (fmul u, v), (fmul x, y)),
// prefer to fold the multiply with fewer uses.		// prefer to fold the multiply with fewer uses.
if (Aggressive && isContractableFMUL(N0) && isContractableFMUL(N1)) {		if (Aggressive && isContractableFMUL(N0) && isContractableFMUL(N1)) {
if (N0.getNode()->use_size() > N1.getNode()->use_size())		if (N0.getNode()->use_size() > N1.getNode()->use_size())
std::swap(N0, N1);		std::swap(N0, N1);
}		}

// fold (fadd (fmul x, y), z) -> (fma x, y, z)		// fold (fadd (fmul x, y), z) -> (fma x, y, z)
if (isContractableFMUL(N0) && (Aggressive || N0->hasOneUse())) {		if (isContractableFMUL(N0) && (Aggressive || N0->hasOneUse())) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
N0.getOperand(0), N0.getOperand(1), N1, Flags);		N0.getOperand(0), N0.getOperand(1), N1, Flags);
}		}

// fold (fadd x, (fmul y, z)) -> (fma y, z, x)		// fold (fadd x, (fmul y, z)) -> (fma y, z, x)
// Note: Commutes FADD operands.		// Note: Commutes FADD operands.
if (isContractableFMUL(N1) && (Aggressive || N1->hasOneUse())) {		if (isContractableFMUL(N1) && (Aggressive || N1->hasOneUse())) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
N1.getOperand(0), N1.getOperand(1), N0, Flags);		N1.getOperand(0), N1.getOperand(1), N0, Flags);
}		}

// Look through FP_EXTEND nodes to do more combining.		// Look through FP_EXTEND nodes to do more combining.

// fold (fadd (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), z)		// fold (fadd (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), z)
if (N0.getOpcode() == ISD::FP_EXTEND) {		if ((N0.getOpcode() == ISD::FP_EXTEND) && matcher.isCompatible(N0.getOperand(0))) {
SDValue N00 = N0.getOperand(0);		SDValue N00 = N0.getOperand(0);
if (isContractableFMUL(N00) &&		if (isContractableFMUL(N00) &&
TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,		TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,
N00.getValueType())) {		N00.getValueType())) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
DAG.getNode(ISD::FP_EXTEND, SL, VT,		DAG.getNode(ISD::FP_EXTEND, SL, VT,
N00.getOperand(0)),		N00.getOperand(0)),
DAG.getNode(ISD::FP_EXTEND, SL, VT,		matcher.getNode(ISD::FP_EXTEND, SL, VT,
N00.getOperand(1)), N1, Flags);		N00.getOperand(1)), N1, Flags);
}		}
}		}

// fold (fadd x, (fpext (fmul y, z))) -> (fma (fpext y), (fpext z), x)		// fold (fadd x, (fpext (fmul y, z))) -> (fma (fpext y), (fpext z), x)
// Note: Commutes FADD operands.		// Note: Commutes FADD operands.
if (N1.getOpcode() == ISD::FP_EXTEND) {		if (matcher.match(N1, ISD::FP_EXTEND)) {
SDValue N10 = N1.getOperand(0);		SDValue N10 = N1.getOperand(0);
if (isContractableFMUL(N10) &&		if (isContractableFMUL(N10) &&
TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,		TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,
N10.getValueType())) {		N10.getValueType())) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
DAG.getNode(ISD::FP_EXTEND, SL, VT,		matcher.getNode(ISD::FP_EXTEND, SL, VT,
N10.getOperand(0)),		N10.getOperand(0)),
DAG.getNode(ISD::FP_EXTEND, SL, VT,		matcher.getNode(ISD::FP_EXTEND, SL, VT,
N10.getOperand(1)), N0, Flags);		N10.getOperand(1)), N0, Flags);
}		}
}		}

// More folding opportunities when target permits.		// More folding opportunities when target permits.
if (Aggressive) {		if (Aggressive) {
// fold (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y (fma u, v, z))		// fold (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y (fma u, v, z))
if (CanFuse &&		if (CanFuse &&
N0.getOpcode() == PreferredFusedOpcode &&		matcher.match(N0, PreferredFusedOpcode) &&
N0.getOperand(2).getOpcode() == ISD::FMUL &&		matcher.match(N0.getOperand(2), ISD::FMUL) &&
N0->hasOneUse() && N0.getOperand(2)->hasOneUse()) {		N0->hasOneUse() && N0.getOperand(2)->hasOneUse()) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
N0.getOperand(0), N0.getOperand(1),		N0.getOperand(0), N0.getOperand(1),
DAG.getNode(PreferredFusedOpcode, SL, VT,		matcher.getNode(PreferredFusedOpcode, SL, VT,
N0.getOperand(2).getOperand(0),		N0.getOperand(2).getOperand(0),
N0.getOperand(2).getOperand(1),		N0.getOperand(2).getOperand(1),
N1, Flags), Flags);		N1, Flags), Flags);
}		}

// fold (fadd x, (fma y, z, (fmul u, v)) -> (fma y, z (fma u, v, x))		// fold (fadd x, (fma y, z, (fmul u, v)) -> (fma y, z (fma u, v, x))
if (CanFuse &&		if (CanFuse &&
N1->getOpcode() == PreferredFusedOpcode &&		matcher.match(N1, PreferredFusedOpcode) &&
N1.getOperand(2).getOpcode() == ISD::FMUL &&		matcher.match(N1.getOperand(2), ISD::FMUL) &&
N1->hasOneUse() && N1.getOperand(2)->hasOneUse()) {		N1->hasOneUse() && N1.getOperand(2)->hasOneUse()) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
N1.getOperand(0), N1.getOperand(1),		N1.getOperand(0), N1.getOperand(1),
DAG.getNode(PreferredFusedOpcode, SL, VT,		matcher.getNode(PreferredFusedOpcode, SL, VT,
N1.getOperand(2).getOperand(0),		N1.getOperand(2).getOperand(0),
N1.getOperand(2).getOperand(1),		N1.getOperand(2).getOperand(1),
N0, Flags), Flags);		N0, Flags), Flags);
}		}


// fold (fadd (fma x, y, (fpext (fmul u, v))), z)		// fold (fadd (fma x, y, (fpext (fmul u, v))), z)
// -> (fma x, y, (fma (fpext u), (fpext v), z))		// -> (fma x, y, (fma (fpext u), (fpext v), z))
auto FoldFAddFMAFPExtFMul = [&] (		auto FoldFAddFMAFPExtFMul = [&] (
SDValue X, SDValue Y, SDValue U, SDValue V, SDValue Z,		SDValue X, SDValue Y, SDValue U, SDValue V, SDValue Z,
SDNodeFlags Flags) {		SDNodeFlags Flags) {
return DAG.getNode(PreferredFusedOpcode, SL, VT, X, Y,		return matcher.getNode(PreferredFusedOpcode, SL, VT, X, Y,
DAG.getNode(PreferredFusedOpcode, SL, VT,		matcher.getNode(PreferredFusedOpcode, SL, VT,
DAG.getNode(ISD::FP_EXTEND, SL, VT, U),		matcher.getNode(ISD::FP_EXTEND, SL, VT, U),
DAG.getNode(ISD::FP_EXTEND, SL, VT, V),		matcher.getNode(ISD::FP_EXTEND, SL, VT, V),
Z, Flags), Flags);		Z, Flags), Flags);
};		};
if (N0.getOpcode() == PreferredFusedOpcode) {		if (matcher.match(N0, PreferredFusedOpcode)) {
SDValue N02 = N0.getOperand(2);		SDValue N02 = N0.getOperand(2);
if (N02.getOpcode() == ISD::FP_EXTEND) {		if (matcher.match(N02, ISD::FP_EXTEND)) {
SDValue N020 = N02.getOperand(0);		SDValue N020 = N02.getOperand(0);
if (isContractableFMUL(N020) &&		if (isContractableFMUL(N020) &&
TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,		TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,
N020.getValueType())) {		N020.getValueType())) {
return FoldFAddFMAFPExtFMul(N0.getOperand(0), N0.getOperand(1),		return FoldFAddFMAFPExtFMul(N0.getOperand(0), N0.getOperand(1),
N020.getOperand(0), N020.getOperand(1),		N020.getOperand(0), N020.getOperand(1),
N1, Flags);		N1, Flags);
}		}
}		}
}		}

// fold (fadd (fpext (fma x, y, (fmul u, v))), z)		// fold (fadd (fpext (fma x, y, (fmul u, v))), z)
// -> (fma (fpext x), (fpext y), (fma (fpext u), (fpext v), z))		// -> (fma (fpext x), (fpext y), (fma (fpext u), (fpext v), z))
// FIXME: This turns two single-precision and one double-precision		// FIXME: This turns two single-precision and one double-precision
// operation into two double-precision operations, which might not be		// operation into two double-precision operations, which might not be
// interesting for all targets, especially GPUs.		// interesting for all targets, especially GPUs.
auto FoldFAddFPExtFMAFMul = [&] (		auto FoldFAddFPExtFMAFMul = [&] (
SDValue X, SDValue Y, SDValue U, SDValue V, SDValue Z,		SDValue X, SDValue Y, SDValue U, SDValue V, SDValue Z,
SDNodeFlags Flags) {		SDNodeFlags Flags) {
return DAG.getNode(PreferredFusedOpcode, SL, VT,		return matcher.getNode(PreferredFusedOpcode, SL, VT,
DAG.getNode(ISD::FP_EXTEND, SL, VT, X),		matcher.getNode(ISD::FP_EXTEND, SL, VT, X),
DAG.getNode(ISD::FP_EXTEND, SL, VT, Y),		matcher.getNode(ISD::FP_EXTEND, SL, VT, Y),
DAG.getNode(PreferredFusedOpcode, SL, VT,		matcher.getNode(PreferredFusedOpcode, SL, VT,
DAG.getNode(ISD::FP_EXTEND, SL, VT, U),		matcher.getNode(ISD::FP_EXTEND, SL, VT, U),
DAG.getNode(ISD::FP_EXTEND, SL, VT, V),		matcher.getNode(ISD::FP_EXTEND, SL, VT, V),
Z, Flags), Flags);		Z, Flags), Flags);
};		};
if (N0.getOpcode() == ISD::FP_EXTEND) {		if (N0.getOpcode() == ISD::FP_EXTEND) {
SDValue N00 = N0.getOperand(0);		SDValue N00 = N0.getOperand(0);
if (N00.getOpcode() == PreferredFusedOpcode) {		if (N00.getOpcode() == PreferredFusedOpcode) {
SDValue N002 = N00.getOperand(2);		SDValue N002 = N00.getOperand(2);
if (isContractableFMUL(N002) &&		if (isContractableFMUL(N002) &&
TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,		TLI.isFPExtFoldable(DAG, PreferredFusedOpcode, VT,
▲ Show 20 Lines • Show All 434 Lines • ▼ Show 20 Lines	SDValue DAGCombiner::visitFMULForFMADistributiveCombine(SDNode *N) {
if (SDValue FMA = FuseFSUB(N0, N1, Flags))		if (SDValue FMA = FuseFSUB(N0, N1, Flags))
return FMA;		return FMA;
if (SDValue FMA = FuseFSUB(N1, N0, Flags))		if (SDValue FMA = FuseFSUB(N1, N0, Flags))
return FMA;		return FMA;

return SDValue();		return SDValue();
}		}

		SDValue DAGCombiner::visitFADD_VP(SDNode *N) {
		// FADD -> FMA combines:
		if (SDValue Fused = visitFADDForFMACombine<VPMatchContext>(N)) {
		AddToWorklist(Fused.getNode());
		return Fused;
		}
		return SDValue();
		}

SDValue DAGCombiner::visitFADD(SDNode *N) {		SDValue DAGCombiner::visitFADD(SDNode *N) {
SDValue N0 = N->getOperand(0);		SDValue N0 = N->getOperand(0);
SDValue N1 = N->getOperand(1);		SDValue N1 = N->getOperand(1);
bool N0CFP = isConstantFPBuildVectorOrConstantFP(N0);		bool N0CFP = isConstantFPBuildVectorOrConstantFP(N0);
bool N1CFP = isConstantFPBuildVectorOrConstantFP(N1);		bool N1CFP = isConstantFPBuildVectorOrConstantFP(N1);
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
SDLoc DL(N);		SDLoc DL(N);
const TargetOptions &Options = DAG.getTarget().Options;		const TargetOptions &Options = DAG.getTarget().Options;
▲ Show 20 Lines • Show All 158 Lines • ▼ Show 20 Lines	if (TLI.isOperationLegalOrCustom(ISD::FMUL, VT) && !N0CFP && !N1CFP) {
N0.getOperand(0) == N1.getOperand(0)) {		N0.getOperand(0) == N1.getOperand(0)) {
return DAG.getNode(ISD::FMUL, DL, VT, N0.getOperand(0),		return DAG.getNode(ISD::FMUL, DL, VT, N0.getOperand(0),
DAG.getConstantFP(4.0, DL, VT), Flags);		DAG.getConstantFP(4.0, DL, VT), Flags);
}		}
}		}
} // enable-unsafe-fp-math		} // enable-unsafe-fp-math

// FADD -> FMA combines:		// FADD -> FMA combines:
if (SDValue Fused = visitFADDForFMACombine(N)) {		if (SDValue Fused = visitFADDForFMACombine<EmptyMatchContext>(N)) {
AddToWorklist(Fused.getNode());		AddToWorklist(Fused.getNode());
return Fused;		return Fused;
}		}
return SDValue();		return SDValue();
}		}

SDValue DAGCombiner::visitFSUB(SDNode *N) {		SDValue DAGCombiner::visitFSUB(SDNode *N) {
SDValue N0 = N->getOperand(0);		SDValue N0 = N->getOperand(0);
▲ Show 20 Lines • Show All 9,076 Lines • Show Last 20 Lines
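
Note on the DAGCombiner.cpp change above: `visitFADDForFMACombine` is templated on a match context, so the same fold logic serves both plain `FADD` nodes (`EmptyMatchContext`) and `VP_FADD` nodes (`VPMatchContext`). The following is a minimal, self-contained sketch of that pattern, not the patch's actual code. All names in it (`Node`, `Graph`, `PlainContext`, `MaskedContext`, `foldAddOfMul`) are hypothetical toy stand-ins for the SelectionDAG types, and it models only a mask operand, not the vector length operand.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for SelectionDAG nodes; every name here is hypothetical.
enum Opcode { LEAF, ADD, MUL, FMA, VP_ADD, VP_MUL, VP_FMA };

struct Node {
  Opcode Op;
  std::vector<const Node *> Ops; // data operands
  const Node *Mask = nullptr;    // predicate, only set on VP_* nodes
};

// Owns every node it creates, so raw pointers to nodes stay valid.
struct Graph {
  std::vector<std::unique_ptr<Node>> Nodes;
  const Node *make(Opcode Op, std::vector<const Node *> Ops = {},
                   const Node *Mask = nullptr) {
    Nodes.push_back(std::make_unique<Node>(Node{Op, std::move(Ops), Mask}));
    return Nodes.back().get();
  }
};

// Maps a predicated opcode to the operation it performs.
inline Opcode functionalOpcode(Opcode Op) {
  switch (Op) {
  case VP_ADD: return ADD;
  case VP_MUL: return MUL;
  case VP_FMA: return FMA;
  default:     return Op;
  }
}

// Maps an unpredicated opcode to its predicated counterpart.
inline Opcode predicatedOpcode(Opcode Op) {
  switch (Op) {
  case ADD: return VP_ADD;
  case MUL: return VP_MUL;
  case FMA: return VP_FMA;
  default:  assert(false && "no predicated form"); return Op;
  }
}

// Context for unpredicated nodes: compares opcodes directly.
struct PlainContext {
  Graph &G;
  PlainContext(Graph &G, const Node *) : G(G) {}
  bool match(const Node *N, Opcode Op) const { return N->Op == Op; }
  const Node *getNode(Opcode Op, std::vector<const Node *> Ops) {
    return G.make(Op, std::move(Ops));
  }
};

// Context for predicated nodes: a node matches only if it performs the
// requested operation under the same mask as the root, and every node it
// emits inherits the root's mask.
struct MaskedContext {
  Graph &G;
  const Node *RootMask;
  MaskedContext(Graph &G, const Node *Root) : G(G), RootMask(Root->Mask) {}
  bool match(const Node *N, Opcode Op) const {
    return N->Mask == RootMask && functionalOpcode(N->Op) == Op;
  }
  const Node *getNode(Opcode Op, std::vector<const Node *> Ops) {
    return G.make(predicatedOpcode(Op), std::move(Ops), RootMask);
  }
};

// fold (add (mul x, y), z) -> (fma x, y, z), written once for both contexts.
// Returns nullptr when the pattern does not apply.
template <class Ctx>
const Node *foldAddOfMul(Graph &G, const Node *N) {
  Ctx C(G, N);
  if (!C.match(N, ADD))
    return nullptr;
  const Node *A = N->Ops[0];
  const Node *B = N->Ops[1];
  if (C.match(A, MUL))
    return C.getNode(FMA, {A->Ops[0], A->Ops[1], B});
  if (C.match(B, MUL)) // commuted form
    return C.getNode(FMA, {B->Ops[0], B->Ops[1], A});
  return nullptr;
}
```

In this sketch, `foldAddOfMul<MaskedContext>` refuses to fuse a multiply whose mask differs from the root's, which mirrors the role `isCompatible` plays in `VPMatchContext`; whether that is the right legality condition for every fold is a design question for the patch itself.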

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

Show First 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FP_TO_UINT:		case ISD::STRICT_FP_TO_UINT:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;		case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;

case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;		case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;

case ISD::FLT_ROUNDS_: Res = PromoteIntRes_FLT_ROUNDS(N); break;		case ISD::FLT_ROUNDS_: Res = PromoteIntRes_FLT_ROUNDS(N); break;

		case ISD::VP_REDUCE_MUL:
		case ISD::VP_REDUCE_ADD:
		case ISD::VP_REDUCE_AND:
		case ISD::VP_REDUCE_XOR:
		case ISD::VP_REDUCE_OR:
		Res = PromoteIntRes_VP_REDUCE_nostart(N); break;

case ISD::AND:		case ISD::AND:
case ISD::OR:		case ISD::OR:
case ISD::XOR:		case ISD::XOR:
case ISD::ADD:		case ISD::ADD:
case ISD::SUB:		case ISD::SUB:
case ISD::MUL: Res = PromoteIntRes_SimpleIntBinOp(N); break;		case ISD::MUL: Res = PromoteIntRes_SimpleIntBinOp(N); break;

case ISD::SDIV:		case ISD::SDIV:
▲ Show 20 Lines • Show All 1,006 Lines • ▼ Show 20 Lines	SDValue DAGTypeLegalizer::PromoteIntRes_UADDSUBO(SDNode *N, unsigned ResNo) {

// Use the calculated overflow everywhere.		// Use the calculated overflow everywhere.
ReplaceValueWith(SDValue(N, 1), Ofl);		ReplaceValueWith(SDValue(N, 1), Ofl);

return Res;		return Res;
}		}

// Handle promotion for the ADDE/SUBE/ADDCARRY/SUBCARRY nodes. Notice that		// Handle promotion for the ADDE/SUBE/ADDCARRY/SUBCARRY nodes. Notice that
// the third operand of ADDE/SUBE nodes is carry flag, which differs from		// the third operand of ADDE/SUBE nodes is carry flag, which differs from
// the ADDCARRY/SUBCARRY nodes in that the third operand is carry Boolean.		// the ADDCARRY/SUBCARRY nodes in that the third operand is carry Boolean.
SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBCARRY(SDNode *N, unsigned ResNo) {		SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBCARRY(SDNode *N, unsigned ResNo) {
if (ResNo == 1)		if (ResNo == 1)
return PromoteIntRes_Overflow(N);		return PromoteIntRes_Overflow(N);

// We need to sign-extend the operands so the carry value computed by the		// We need to sign-extend the operands so the carry value computed by the
// wide operation will be equivalent to the carry value computed by the		// wide operation will be equivalent to the carry value computed by the
// narrow operation.		// narrow operation.
▲ Show 20 Lines • Show All 141 Lines • ▼ Show 20 Lines	bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {
LLVM_DEBUG(dbgs() << "Promote integer operand: "; N->dump(&DAG);		LLVM_DEBUG(dbgs() << "Promote integer operand: "; N->dump(&DAG);
dbgs() << "\n");		dbgs() << "\n");
SDValue Res = SDValue();		SDValue Res = SDValue();
if (CustomLowerNode(N, N->getOperand(OpNo).getValueType(), false)) {		if (CustomLowerNode(N, N->getOperand(OpNo).getValueType(), false)) {
LLVM_DEBUG(dbgs() << "Node has been custom lowered, done\n");		LLVM_DEBUG(dbgs() << "Node has been custom lowered, done\n");
return false;		return false;
}		}

		if (N->isVP()) {
		Res = PromoteIntOp_VP(N, OpNo);
		} else {
switch (N->getOpcode()) {		switch (N->getOpcode()) {
default:		default:
#ifndef NDEBUG		#ifndef NDEBUG
dbgs() << "PromoteIntegerOperand Op #" << OpNo << ": ";		dbgs() << "PromoteIntegerOperand Op #" << OpNo << ": ";
N->dump(&DAG); dbgs() << "\n";		N->dump(&DAG); dbgs() << "\n";
#endif		#endif
llvm_unreachable("Do not know how to promote this operator's operand!");		llvm_unreachable("Do not know how to promote this operator's operand!");

case ISD::ANY_EXTEND: Res = PromoteIntOp_ANY_EXTEND(N); break;		case ISD::ANY_EXTEND: Res = PromoteIntOp_ANY_EXTEND(N); break;
▲ Show 20 Lines • Show All 67 Lines • ▼ Show 20 Lines	bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
case ISD::VECREDUCE_SMAX:		case ISD::VECREDUCE_SMAX:
case ISD::VECREDUCE_SMIN:		case ISD::VECREDUCE_SMIN:
case ISD::VECREDUCE_UMAX:		case ISD::VECREDUCE_UMAX:
case ISD::VECREDUCE_UMIN: Res = PromoteIntOp_VECREDUCE(N); break;		case ISD::VECREDUCE_UMIN: Res = PromoteIntOp_VECREDUCE(N); break;
}		}
		}

// If the result is null, the sub-method took care of registering results etc.		// If the result is null, the sub-method took care of registering results etc.
if (!Res.getNode()) return false;		if (!Res.getNode()) return false;

// If the result is N, the sub-method updated N in place. Tell the legalizer		// If the result is N, the sub-method updated N in place. Tell the legalizer
// core about this.		// core about this.
if (Res.getNode() == N)		if (Res.getNode() == N)
return true;		return true;
▲ Show 20 Lines • Show All 277 Lines • ▼ Show 20 Lines	SDValue DAGTypeLegalizer::PromoteIntOp_MSTORE(MaskedStoreSDNode *N,
}		}

return DAG.getMaskedStore(N->getChain(), dl, DataOp, N->getBasePtr(),		return DAG.getMaskedStore(N->getChain(), dl, DataOp, N->getBasePtr(),
N->getOffset(), Mask, N->getMemoryVT(),		N->getOffset(), Mask, N->getMemoryVT(),
N->getMemOperand(), N->getAddressingMode(),		N->getMemOperand(), N->getAddressingMode(),
TruncateStore, N->isCompressingStore());		TruncateStore, N->isCompressingStore());
}		}

		SDValue DAGTypeLegalizer::PromoteIntOp_VP(SDNode *N, unsigned OpNo) {
		EVT DataVT;
		switch (N->getOpcode()) {
		default:
		DataVT = N->getValueType(0);
		break;

		case ISD::VP_STORE:
		case ISD::VP_SCATTER:
		llvm_unreachable("TODO implement VP memory nodes");
		}

		// TODO assert that \p OpNo is the mask
		SDValue Mask = PromoteTargetBoolean(N->getOperand(OpNo), DataVT);
		SmallVector<SDValue, 4> NewOps(N->op_begin(), N->op_end());
		NewOps[OpNo] = Mask;
		return SDValue(DAG.UpdateNodeOperands(N, NewOps), 0);
		}

SDValue DAGTypeLegalizer::PromoteIntOp_MLOAD(MaskedLoadSDNode *N,		SDValue DAGTypeLegalizer::PromoteIntOp_MLOAD(MaskedLoadSDNode *N,
unsigned OpNo) {		unsigned OpNo) {
assert(OpNo == 3 && "Only know how to promote the mask!");		assert(OpNo == 3 && "Only know how to promote the mask!");
EVT DataVT = N->getValueType(0);		EVT DataVT = N->getValueType(0);
SDValue Mask = PromoteTargetBoolean(N->getOperand(OpNo), DataVT);		SDValue Mask = PromoteTargetBoolean(N->getOperand(OpNo), DataVT);
SmallVector<SDValue, 4> NewOps(N->op_begin(), N->op_end());		SmallVector<SDValue, 4> NewOps(N->op_begin(), N->op_end());
NewOps[OpNo] = Mask;		NewOps[OpNo] = Mask;
return SDValue(DAG.UpdateNodeOperands(N, NewOps), 0);		return SDValue(DAG.UpdateNodeOperands(N, NewOps), 0);
▲ Show 20 Lines • Show All 2,674 Lines • ▼ Show 20 Lines	SDValue DAGTypeLegalizer::PromoteIntRes_SCALAR_TO_VECTOR(SDNode *N) {
assert(NOutVT.isVector() && "This type must be promoted to a vector type");		assert(NOutVT.isVector() && "This type must be promoted to a vector type");
EVT NOutVTElem = NOutVT.getVectorElementType();		EVT NOutVTElem = NOutVT.getVectorElementType();

SDValue Op = DAG.getNode(ISD::ANY_EXTEND, dl, NOutVTElem, N->getOperand(0));		SDValue Op = DAG.getNode(ISD::ANY_EXTEND, dl, NOutVTElem, N->getOperand(0));

return DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, NOutVT, Op);		return DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, NOutVT, Op);
}		}

		SDValue DAGTypeLegalizer::PromoteIntRes_VP_REDUCE_nostart(SDNode *N) {
		SDLoc dl(N);

		SDValue VecVal = N->getOperand(0);
		SDValue MaskVal = N->getOperand(1);
		SDValue LenVal = N->getOperand(2);

		EVT VecVT = VecVal.getValueType();

		assert(VecVal.getValueType().isVector() && "Input must be a vector");
		assert(MaskVal.getValueType().isVector() && "Mask must be a vector");
		assert(!LenVal.getValueType().isVector() && "Vector length must be a scalar");

		EVT OutVT = N->getValueType(0);
		EVT NOutVT = TLI.getTypeToTransformTo(*DAG.getContext(), OutVT);
		assert(NOutVT.isScalarInteger() && "Type must be promoted to a scalar integer type");
		EVT NVecVT = TLI.getTypeToTransformTo(*DAG.getContext(), VecVT);
		// EVT NVecVT = EVT::getVectorVT(*DAG.getContext(), NOutVT, VecVT.getVectorNumElements(), VecVT.isScalableVector());

		// extend operand along with result type
		SDValue ExtVecVal = (NVecVT == VecVT) ? VecVal : DAG.getNode(ISD::ANY_EXTEND, dl, NVecVT, VecVal);

		return DAG.getNode(N->getOpcode(), dl, NOutVT, {ExtVecVal, MaskVal, LenVal});
		}
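
`PromoteIntRes_VP_REDUCE_nostart` widens the vector operand with `ANY_EXTEND`, which leaves the new high bits unspecified. For the five opcodes routed to it (MUL, ADD, AND, XOR, OR) the low bits of the result depend only on the low bits of the inputs, so the junk in the high bits cannot leak into the part of the result that matters. A sketch of that property in plain C++, using 8-bit lanes widened to 32 bits; the helper names are hypothetical and the mask and vector-length operands are ignored:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ReduceOp { Add, Mul, And, Or, Xor };

// Fold all lanes with the given operation in the unsigned type T.
template <typename T>
T reduce(ReduceOp Op, const std::vector<T> &Lanes) {
  assert(!Lanes.empty() && "no start value in this sketch, so need a lane");
  T Acc = Lanes[0];
  for (std::size_t I = 1; I < Lanes.size(); ++I) {
    switch (Op) {
    case ReduceOp::Add: Acc = static_cast<T>(Acc + Lanes[I]); break;
    case ReduceOp::Mul: Acc = static_cast<T>(Acc * Lanes[I]); break;
    case ReduceOp::And: Acc = static_cast<T>(Acc & Lanes[I]); break;
    case ReduceOp::Or:  Acc = static_cast<T>(Acc | Lanes[I]); break;
    case ReduceOp::Xor: Acc = static_cast<T>(Acc ^ Lanes[I]); break;
    }
  }
  return Acc;
}

// Model ANY_EXTEND: keep the low 8 bits and fill the upper 24 bits with
// caller-chosen junk standing in for "unspecified".
std::vector<std::uint32_t> anyExtend(const std::vector<std::uint8_t> &Lanes,
                                     std::uint32_t Junk) {
  std::vector<std::uint32_t> Wide;
  for (std::uint8_t L : Lanes)
    Wide.push_back((Junk << 8) | L);
  return Wide;
}

// Reduce in the wide type, then keep only the low 8 bits of the result.
std::uint8_t reduceViaPromotion(ReduceOp Op,
                                const std::vector<std::uint8_t> &Lanes,
                                std::uint32_t Junk) {
  return static_cast<std::uint8_t>(
      reduce<std::uint32_t>(Op, anyExtend(Lanes, Junk)));
}
```

The same argument would not hold for min/max reductions, whose results depend on the high bits; those opcodes are not among the cases dispatched to this helper.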

SDValue DAGTypeLegalizer::PromoteIntRes_SPLAT_VECTOR(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_SPLAT_VECTOR(SDNode *N) {
SDLoc dl(N);		SDLoc dl(N);

SDValue SplatVal = N->getOperand(0);		SDValue SplatVal = N->getOperand(0);

assert(!SplatVal.getValueType().isVector() && "Input must be a scalar");		assert(!SplatVal.getValueType().isVector() && "Input must be a scalar");

EVT OutVT = N->getValueType(0);		EVT OutVT = N->getValueType(0);
▲ Show 20 Lines • Show All 153 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h

Show First 20 Lines • Show All 334 Lines • ▼ Show 20 Lines	private:
SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_ADDSUBSAT(SDNode *N);		SDValue PromoteIntRes_ADDSUBSAT(SDNode *N);
SDValue PromoteIntRes_MULFIX(SDNode *N);		SDValue PromoteIntRes_MULFIX(SDNode *N);
SDValue PromoteIntRes_DIVFIX(SDNode *N);		SDValue PromoteIntRes_DIVFIX(SDNode *N);
SDValue PromoteIntRes_FLT_ROUNDS(SDNode *N);		SDValue PromoteIntRes_FLT_ROUNDS(SDNode *N);
SDValue PromoteIntRes_VECREDUCE(SDNode *N);		SDValue PromoteIntRes_VECREDUCE(SDNode *N);
SDValue PromoteIntRes_ABS(SDNode *N);		SDValue PromoteIntRes_ABS(SDNode *N);

		// vp reduction without start value
		SDValue PromoteIntRes_VP_REDUCE_nostart(SDNode *N);

// Integer Operand Promotion.		// Integer Operand Promotion.
bool PromoteIntegerOperand(SDNode *N, unsigned OpNo);		bool PromoteIntegerOperand(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_ANY_EXTEND(SDNode *N);		SDValue PromoteIntOp_ANY_EXTEND(SDNode *N);
SDValue PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N);		SDValue PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N);
SDValue PromoteIntOp_BITCAST(SDNode *N);		SDValue PromoteIntOp_BITCAST(SDNode *N);
SDValue PromoteIntOp_BUILD_PAIR(SDNode *N);		SDValue PromoteIntOp_BUILD_PAIR(SDNode *N);
SDValue PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_BRCOND(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_BRCOND(SDNode *N, unsigned OpNo);
Show All 21 Lines	private:
SDValue PromoteIntOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);		SDValue PromoteIntOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);
SDValue PromoteIntOp_MGATHER(MaskedGatherSDNode *N, unsigned OpNo);		SDValue PromoteIntOp_MGATHER(MaskedGatherSDNode *N, unsigned OpNo);
SDValue PromoteIntOp_ADDSUBCARRY(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_ADDSUBCARRY(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_FRAMERETURNADDR(SDNode *N);		SDValue PromoteIntOp_FRAMERETURNADDR(SDNode *N);
SDValue PromoteIntOp_PREFETCH(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_PREFETCH(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_FIX(SDNode *N);		SDValue PromoteIntOp_FIX(SDNode *N);
SDValue PromoteIntOp_FPOWI(SDNode *N);		SDValue PromoteIntOp_FPOWI(SDNode *N);
SDValue PromoteIntOp_VECREDUCE(SDNode *N);		SDValue PromoteIntOp_VECREDUCE(SDNode *N);
		SDValue PromoteIntOp_VP(SDNode *N, unsigned OpNo);

void PromoteSetCCOperands(SDValue &LHS,SDValue &RHS, ISD::CondCode Code);		void PromoteSetCCOperands(SDValue &LHS,SDValue &RHS, ISD::CondCode Code);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Integer Expansion Support: LegalizeIntegerTypes.cpp		// Integer Expansion Support: LegalizeIntegerTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Given a processed operand Op which was expanded into two integers of half		/// Given a processed operand Op which was expanded into two integers of half
▲ Show 20 Lines • Show All 623 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp


Show First 20 Lines • Show All 438 Lines • ▼ Show 20 Lines	if (IsInteger) {
case ISD::SETOGT: Result = ISD::SETUGT ; break; // SETUGT & SETNE		case ISD::SETOGT: Result = ISD::SETUGT ; break; // SETUGT & SETNE
}		}
}		}

return Result;		return Result;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		// SDNode VP Support
		//===----------------------------------------------------------------------===//

		int
		ISD::GetMaskPosVP(unsigned OpCode) {
		switch (OpCode) {
		default: return -1;

		case ISD::VP_FNEG:
		return 1;

		case ISD::VP_ADD:
		case ISD::VP_SUB:
		case ISD::VP_MUL:
		case ISD::VP_SDIV:
		case ISD::VP_SREM:
		case ISD::VP_UDIV:
		case ISD::VP_UREM:

		case ISD::VP_AND:
		case ISD::VP_OR:
		case ISD::VP_XOR:
		case ISD::VP_SHL:
		case ISD::VP_SRA:
		case ISD::VP_SRL:

		case ISD::VP_FADD:
		case ISD::VP_FMUL:
		case ISD::VP_FSUB:
		case ISD::VP_FDIV:
		case ISD::VP_FREM:
		return 2;

		case ISD::VP_FMA:
		case ISD::VP_SELECT:
		return 3;

		case VP_REDUCE_ADD:
		case VP_REDUCE_MUL:
		case VP_REDUCE_AND:
		case VP_REDUCE_OR:
		case VP_REDUCE_XOR:
		case VP_REDUCE_SMAX:
		case VP_REDUCE_SMIN:
		case VP_REDUCE_UMAX:
		case VP_REDUCE_UMIN:
		case VP_REDUCE_FMAX:
		case VP_REDUCE_FMIN:
		return 1;

		case VP_REDUCE_FADD:
		case VP_REDUCE_FMUL:
		return 2;

		/// FMIN/FMAX nodes can have flags, for NaN/NoNaN variants.
		// (implicit) case ISD::VP_COMPOSE: return -1
		}
		}

		int
		ISD::GetVectorLengthPosVP(unsigned OpCode) {
		switch (OpCode) {
		default: return -1;

		case VP_SELECT:
		return 0;

		case VP_FNEG:
		return 2;

		case VP_ADD:
		case VP_SUB:
		case VP_MUL:
		case VP_SDIV:
		case VP_SREM:
		case VP_UDIV:
		case VP_UREM:

		case VP_AND:
		case VP_OR:
		case VP_XOR:
		case VP_SHL:
		case VP_SRA:
		case VP_SRL:

		case VP_FADD:
		case VP_FMUL:
		case VP_FSUB:
		case VP_FDIV:
		case VP_FREM:
		return 3;

		case VP_FMA:
		return 4;

		case VP_COMPOSE:
		return 3;

		case VP_REDUCE_ADD:
		case VP_REDUCE_MUL:
		case VP_REDUCE_AND:
		case VP_REDUCE_OR:
		case VP_REDUCE_XOR:
		case VP_REDUCE_SMAX:
		case VP_REDUCE_SMIN:
		case VP_REDUCE_UMAX:
		case VP_REDUCE_UMIN:
		case VP_REDUCE_FMAX:
		case VP_REDUCE_FMIN:
		return 2;

		case VP_REDUCE_FADD:
		case VP_REDUCE_FMUL:
		return 3;

		}
		}
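
For the arithmetic and reduction opcodes, the two switches agree on one layout: the mask follows the data operands and the vector length follows the mask (for example `VP_ADD` has mask position 2 and length position 3, `VP_FMA` has 3 and 4). A sketch of deriving both positions from a single operand count, which keeps the two tables from drifting apart. The enum and function names are hypothetical, and opcodes that do not fit the pattern in the listing (`VP_SELECT`, `VP_COMPOSE`) are deliberately left out:

```cpp
#include <cassert>
#include <vector>

// Hypothetical subset of opcodes, for illustration only.
enum class VPOp { FNeg, Add, FAdd, FMA, ReduceAdd, ReduceFAdd };

// Number of operands that precede the mask. ReduceFAdd is counted with
// two, matching mask position 2 in the listing (presumably a start value
// plus the vector).
constexpr int numLeadingOperands(VPOp Op) {
  switch (Op) {
  case VPOp::FNeg:
  case VPOp::ReduceAdd:
    return 1;
  case VPOp::Add:
  case VPOp::FAdd:
  case VPOp::ReduceFAdd:
    return 2;
  case VPOp::FMA:
    return 3;
  }
  return -1; // not reached for the enumerators above
}

// The mask comes right after the leading operands, the length after it.
constexpr int maskPos(VPOp Op) { return numLeadingOperands(Op); }
constexpr int vectorLengthPos(VPOp Op) { return maskPos(Op) + 1; }

// Fetch the mask from a flat operand list, checking that the list is
// long enough to also hold the vector length.
int maskOperand(VPOp Op, const std::vector<int> &Operands) {
  assert(vectorLengthPos(Op) < static_cast<int>(Operands.size()) &&
         "operand list too short for this opcode");
  return Operands[maskPos(Op)];
}
```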

		unsigned
		ISD::GetFunctionOpCodeForVP(unsigned OpCode, bool hasFPExcept) {
		switch (OpCode) {
		default: return OpCode;

		case VP_SELECT: return ISD::VSELECT;
		case VP_ADD: return ISD::ADD;
		case VP_SUB: return ISD::SUB;
		case VP_MUL: return ISD::MUL;
		case VP_SDIV: return ISD::SDIV;
		case VP_SREM: return ISD::SREM;
		case VP_UDIV: return ISD::UDIV;
		case VP_UREM: return ISD::UREM;

		case VP_AND: return ISD::AND;
		case VP_OR: return ISD::OR;
		case VP_XOR: return ISD::XOR;
		case VP_SHL: return ISD::SHL;
		case VP_SRA: return ISD::SRA;
		case VP_SRL: return ISD::SRL;

		case VP_FNEG: return ISD::FNEG;
		case VP_FADD: return hasFPExcept ? ISD::STRICT_FADD : ISD::FADD;
		case VP_FSUB: return hasFPExcept ? ISD::STRICT_FSUB : ISD::FSUB;
		case VP_FMUL: return hasFPExcept ? ISD::STRICT_FMUL : ISD::FMUL;
		case VP_FDIV: return hasFPExcept ? ISD::STRICT_FDIV : ISD::FDIV;
		case VP_FREM: return hasFPExcept ? ISD::STRICT_FREM : ISD::FREM;

		case VP_REDUCE_AND: return VECREDUCE_AND;
		case VP_REDUCE_OR: return VECREDUCE_OR;
		case VP_REDUCE_XOR: return VECREDUCE_XOR;
		case VP_REDUCE_ADD: return VECREDUCE_ADD;
		case VP_REDUCE_MUL: return VECREDUCE_MUL;
		case VP_REDUCE_FADD: return VECREDUCE_FADD;
		case VP_REDUCE_FMUL: return VECREDUCE_FMUL;
		case VP_REDUCE_FMAX: return VECREDUCE_FMAX;
		case VP_REDUCE_FMIN: return VECREDUCE_FMIN;
		case VP_REDUCE_UMAX: return VECREDUCE_UMAX;
		case VP_REDUCE_UMIN: return VECREDUCE_UMIN;
		case VP_REDUCE_SMAX: return VECREDUCE_SMAX;
		case VP_REDUCE_SMIN: return VECREDUCE_SMIN;

		case VP_STORE: return ISD::MSTORE;
		case VP_LOAD: return ISD::MLOAD;
		case VP_GATHER: return ISD::MGATHER;
		case VP_SCATTER: return ISD::MSCATTER;

		case VP_FMA: return hasFPExcept ? ISD::STRICT_FMA : ISD::FMA;
		}
		}

		unsigned
		ISD::GetVPForFunctionOpCode(unsigned OpCode) {
		switch (OpCode) {
		default: llvm_unreachable("can not translate this Opcode to VP");

		case VSELECT: return ISD::VP_SELECT;
		case ADD: return ISD::VP_ADD;
		case SUB: return ISD::VP_SUB;
		case MUL: return ISD::VP_MUL;
		case SDIV: return ISD::VP_SDIV;
		case SREM: return ISD::VP_SREM;
		case UDIV: return ISD::VP_UDIV;
		case UREM: return ISD::VP_UREM;

		case AND: return ISD::VP_AND;
		case OR: return ISD::VP_OR;
		case XOR: return ISD::VP_XOR;
		case SHL: return ISD::VP_SHL;
		case SRA: return ISD::VP_SRA;
		case SRL: return ISD::VP_SRL;

		case FNEG: return ISD::VP_FNEG;
		case STRICT_FADD:
		case FADD: return ISD::VP_FADD;
		case STRICT_FSUB:
		case FSUB: return ISD::VP_FSUB;
		case STRICT_FMUL:
		case FMUL: return ISD::VP_FMUL;
		case STRICT_FDIV:
		case FDIV: return ISD::VP_FDIV;
		case STRICT_FREM:
		case FREM: return ISD::VP_FREM;

		case STRICT_FMA:
		case FMA: return ISD::VP_FMA;
		}
		}
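
`GetFunctionOpCodeForVP` and `GetVPForFunctionOpCode` are near-inverses: the strict and non-strict FP opcodes both map to the same VP opcode, and `hasFPExcept` decides which of the two comes back. A reduced sketch of that relationship over a hypothetical three-operation enum; the names are illustrative, and an exception stands in for `llvm_unreachable` so the failure path can be exercised:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical opcode set; not the ISD enumerators.
enum class Op { Add, FAdd, StrictFAdd, FMA, StrictFMA, VPAdd, VPFAdd, VPFMA };

// VP opcode -> unpredicated opcode. FP operations pick the strict variant
// when FP exceptions must be modelled. Non-VP opcodes pass through.
Op functionOpcodeForVP(Op VP, bool HasFPExcept) {
  switch (VP) {
  case Op::VPAdd:
    return Op::Add;
  case Op::VPFAdd:
    return HasFPExcept ? Op::StrictFAdd : Op::FAdd;
  case Op::VPFMA:
    return HasFPExcept ? Op::StrictFMA : Op::FMA;
  default:
    return VP;
  }
}

// Unpredicated opcode -> VP opcode. Strict and non-strict variants
// collapse onto one VP opcode.
Op vpForFunctionOpcode(Op Fn) {
  switch (Fn) {
  case Op::Add:
    return Op::VPAdd;
  case Op::StrictFAdd:
  case Op::FAdd:
    return Op::VPFAdd;
  case Op::StrictFMA:
  case Op::FMA:
    return Op::VPFMA;
  default:
    throw std::invalid_argument("cannot translate this opcode to VP");
  }
}

// Going VP -> function -> VP must return the original opcode for either
// setting of the exception flag.
void checkRoundTrip(Op VP) {
  assert(vpForFunctionOpcode(functionOpcodeForVP(VP, false)) == VP);
  assert(vpForFunctionOpcode(functionOpcodeForVP(VP, true)) == VP);
}
```

The reverse round trip is lossy by construction: starting from `StrictFAdd` and passing `false` for the flag yields `FAdd`, so callers have to carry the exception behaviour separately.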


		//===----------------------------------------------------------------------===//
// SDNode Profile Support		// SDNode Profile Support
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// AddNodeIDOpcode - Add the node opcode to the NodeID data.		/// AddNodeIDOpcode - Add the node opcode to the NodeID data.
static void AddNodeIDOpcode(FoldingSetNodeID &ID, unsigned OpC) {		static void AddNodeIDOpcode(FoldingSetNodeID &ID, unsigned OpC) {
ID.AddInteger(OpC);		ID.AddInteger(OpC);
}		}

▲ Show 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	static void AddNodeIDCustom(FoldingSetNodeID &ID, const SDNode *N) {
}		}
case ISD::STORE: {		case ISD::STORE: {
const StoreSDNode *ST = cast<StoreSDNode>(N);		const StoreSDNode *ST = cast<StoreSDNode>(N);
ID.AddInteger(ST->getMemoryVT().getRawBits());		ID.AddInteger(ST->getMemoryVT().getRawBits());
ID.AddInteger(ST->getRawSubclassData());		ID.AddInteger(ST->getRawSubclassData());
ID.AddInteger(ST->getPointerInfo().getAddrSpace());		ID.AddInteger(ST->getPointerInfo().getAddrSpace());
break;		break;
}		}
		case ISD::VP_LOAD: {
		const VPLoadSDNode *ELD = cast<VPLoadSDNode>(N);
		ID.AddInteger(ELD->getMemoryVT().getRawBits());
		ID.AddInteger(ELD->getRawSubclassData());
		ID.AddInteger(ELD->getPointerInfo().getAddrSpace());
		break;
		}
		case ISD::VP_STORE: {
		const VPStoreSDNode *EST = cast<VPStoreSDNode>(N);
		ID.AddInteger(EST->getMemoryVT().getRawBits());
		ID.AddInteger(EST->getRawSubclassData());
		ID.AddInteger(EST->getPointerInfo().getAddrSpace());
		break;
		}
		case ISD::VP_GATHER: {
		const VPGatherSDNode *EG = cast<VPGatherSDNode>(N);
		ID.AddInteger(EG->getMemoryVT().getRawBits());
		ID.AddInteger(EG->getRawSubclassData());
		ID.AddInteger(EG->getPointerInfo().getAddrSpace());
		break;
		}
		case ISD::VP_SCATTER: {
		const VPScatterSDNode *ES = cast<VPScatterSDNode>(N);
		ID.AddInteger(ES->getMemoryVT().getRawBits());
		ID.AddInteger(ES->getRawSubclassData());
		ID.AddInteger(ES->getPointerInfo().getAddrSpace());
		break;
		}
case ISD::MLOAD: {		case ISD::MLOAD: {
const MaskedLoadSDNode *MLD = cast<MaskedLoadSDNode>(N);		const MaskedLoadSDNode *MLD = cast<MaskedLoadSDNode>(N);
ID.AddInteger(MLD->getMemoryVT().getRawBits());		ID.AddInteger(MLD->getMemoryVT().getRawBits());
ID.AddInteger(MLD->getRawSubclassData());		ID.AddInteger(MLD->getRawSubclassData());
ID.AddInteger(MLD->getPointerInfo().getAddrSpace());		ID.AddInteger(MLD->getPointerInfo().getAddrSpace());
break;		break;
}		}
case ISD::MSTORE: {		case ISD::MSTORE: {
▲ Show 20 Lines • Show All 6,534 Lines • ▼ Show 20 Lines	SDValue SelectionDAG::getIndexedMaskedLoad(SDValue OrigLoad, const SDLoc &dl,
MaskedLoadSDNode *LD = cast<MaskedLoadSDNode>(OrigLoad);		MaskedLoadSDNode *LD = cast<MaskedLoadSDNode>(OrigLoad);
assert(LD->getOffset().isUndef() && "Masked load is already a indexed load!");		assert(LD->getOffset().isUndef() && "Masked load is already a indexed load!");
return getMaskedLoad(OrigLoad.getValueType(), dl, LD->getChain(), Base,		return getMaskedLoad(OrigLoad.getValueType(), dl, LD->getChain(), Base,
Offset, LD->getMask(), LD->getPassThru(),		Offset, LD->getMask(), LD->getPassThru(),
LD->getMemoryVT(), LD->getMemOperand(), AM,		LD->getMemoryVT(), LD->getMemOperand(), AM,
LD->getExtensionType(), LD->isExpandingLoad());		LD->getExtensionType(), LD->isExpandingLoad());
}		}


		SDValue SelectionDAG::getLoadVP(EVT VT, const SDLoc &dl, SDValue Chain,
		SDValue Ptr, SDValue Mask, SDValue VLen,
		EVT MemVT, MachineMemOperand *MMO,
		ISD::LoadExtType ExtTy) {
		SDVTList VTs = getVTList(VT, MVT::Other);
		SDValue Ops[] = { Chain, Ptr, Mask, VLen };
		FoldingSetNodeID ID;
		AddNodeIDNode(ID, ISD::VP_LOAD, VTs, Ops);
		ID.AddInteger(VT.getRawBits());
		ID.AddInteger(getSyntheticNodeSubclassData<VPLoadSDNode>(
		dl.getIROrder(), VTs, ExtTy, MemVT, MMO));
		ID.AddInteger(MMO->getPointerInfo().getAddrSpace());
		void *IP = nullptr;
		if (SDNode *E = FindNodeOrInsertPos(ID, dl, IP)) {
		cast<VPLoadSDNode>(E)->refineAlignment(MMO);
		return SDValue(E, 0);
		}
		auto *N = newSDNode<VPLoadSDNode>(dl.getIROrder(), dl.getDebugLoc(), VTs,
		ExtTy, MemVT, MMO);
		createOperands(N, Ops);

		CSEMap.InsertNode(N, IP);
		InsertNode(N);
		SDValue V(N, 0);
		NewSDValueDbgMsg(V, "Creating new node: ", this);
		return V;
		}
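
Review note: each of the `get*VP` builders in this hunk has the same get-or-create shape: fold the opcode, operands and memory attributes into an ID, probe the CSE map, and allocate a node only on a miss. The standalone sketch below models that shape outside of LLVM. `MiniDAG`, `Node`, `NodeKey` and `getOrCreate` are hypothetical names invented for illustration, and a `std::map` stands in for the `FoldingSet` used by `SelectionDAG`; it is a sketch of the pattern, not the patch's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SDNode: an opcode, operand ids, and one
// memory attribute (the address space).
struct Node {
  unsigned Opcode;
  std::vector<int> Operands;
  unsigned AddrSpace;
};

// Hypothetical stand-in for FoldingSetNodeID: everything that must match
// for two requests to denote the same node.
using NodeKey = std::tuple<unsigned, std::vector<int>, unsigned>;

class MiniDAG {
  std::map<NodeKey, std::unique_ptr<Node>> CSEMap;

public:
  // Return the existing node for this key, or create and register a new one.
  Node *getOrCreate(unsigned Opcode, const std::vector<int> &Ops,
                    unsigned AddrSpace) {
    assert(!Ops.empty() && "expected at least a chain operand");
    NodeKey Key{Opcode, Ops, AddrSpace};
    auto It = CSEMap.find(Key);
    if (It != CSEMap.end())
      return It->second.get(); // hit: reuse the node already in the map
    auto Owned = std::make_unique<Node>(Node{Opcode, Ops, AddrSpace});
    Node *Raw = Owned.get();
    CSEMap.emplace(std::move(Key), std::move(Owned));
    return Raw;
  }

  std::size_t size() const { return CSEMap.size(); }
};
```

Because the opcode is part of the key, two requests that differ only in opcode never alias. That is why the opcode folded into the ID has to match the node class that is later `cast<>` on a hit.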


		SDValue SelectionDAG::getStoreVP(SDValue Chain, const SDLoc &dl,
		SDValue Val, SDValue Ptr, SDValue Mask,
		SDValue VLen, EVT MemVT, MachineMemOperand *MMO,
		bool IsTruncating) {
		assert(Chain.getValueType() == MVT::Other &&
		"Invalid chain type");
		SDVTList VTs = getVTList(MVT::Other);
		SDValue Ops[] = { Chain, Val, Ptr, Mask, VLen };
		FoldingSetNodeID ID;
		AddNodeIDNode(ID, ISD::VP_STORE, VTs, Ops);
		ID.AddInteger(MemVT.getRawBits());
		ID.AddInteger(getSyntheticNodeSubclassData<VPStoreSDNode>(
		dl.getIROrder(), VTs, IsTruncating, MemVT, MMO));
		ID.AddInteger(MMO->getPointerInfo().getAddrSpace());
		void *IP = nullptr;
		if (SDNode *E = FindNodeOrInsertPos(ID, dl, IP)) {
		cast<VPStoreSDNode>(E)->refineAlignment(MMO);
		return SDValue(E, 0);
		}
		auto *N = newSDNode<VPStoreSDNode>(dl.getIROrder(), dl.getDebugLoc(), VTs,
		IsTruncating, MemVT, MMO);
		createOperands(N, Ops);

		CSEMap.InsertNode(N, IP);
		InsertNode(N);
		SDValue V(N, 0);
		NewSDValueDbgMsg(V, "Creating new node: ", this);
		return V;
		}

		SDValue SelectionDAG::getGatherVP(SDVTList VTs, EVT VT, const SDLoc &dl,
		ArrayRef<SDValue> Ops,
		MachineMemOperand *MMO,
		ISD::MemIndexType IndexType) {
		assert(Ops.size() == 6 && "Incompatible number of operands");

		FoldingSetNodeID ID;
		AddNodeIDNode(ID, ISD::VP_GATHER, VTs, Ops);
		ID.AddInteger(VT.getRawBits());
		ID.AddInteger(getSyntheticNodeSubclassData<VPGatherSDNode>(
		dl.getIROrder(), VTs, VT, MMO, IndexType));
		ID.AddInteger(MMO->getPointerInfo().getAddrSpace());
		void *IP = nullptr;
		if (SDNode *E = FindNodeOrInsertPos(ID, dl, IP)) {
		cast<VPGatherSDNode>(E)->refineAlignment(MMO);
		return SDValue(E, 0);
		}

		auto *N = newSDNode<VPGatherSDNode>(dl.getIROrder(), dl.getDebugLoc(),
		VTs, VT, MMO, IndexType);
		createOperands(N, Ops);

		assert(N->getMask().getValueType().getVectorNumElements() ==
		N->getValueType(0).getVectorNumElements() &&
		"Vector width mismatch between mask and data");
		assert(N->getIndex().getValueType().getVectorNumElements() >=
		N->getValueType(0).getVectorNumElements() &&
		"Vector width mismatch between index and data");
		assert(isa<ConstantSDNode>(N->getScale()) &&
		cast<ConstantSDNode>(N->getScale())->getAPIntValue().isPowerOf2() &&
		"Scale should be a constant power of 2");

		CSEMap.InsertNode(N, IP);
		InsertNode(N);
		SDValue V(N, 0);
		NewSDValueDbgMsg(V, "Creating new node: ", this);
		return V;
		}
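
Review note: the asserts in `getGatherVP` pin down the operand contract (mask as wide as the data, at least as many indices as data lanes, constant power-of-two scale). The scalar reference model below is a minimal sketch of what such a node computes under those constraints, assuming byte addressing of the form `Base + Index[i] * Scale` and a lane being active only when its mask bit is set and its position is below `VLen`. `gatherModel` is a hypothetical name, and filling inactive lanes with 0 is a placeholder chosen for the sketch, not a statement about the value the proposal assigns to disabled lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical scalar reference model of a predicated gather of int32_t.
std::vector<int32_t> gatherModel(const unsigned char *Base,
                                 const std::vector<int64_t> &Index,
                                 uint64_t Scale,
                                 const std::vector<bool> &Mask,
                                 std::size_t VLen) {
  // Mirrors the builder's checks: power-of-two scale, enough indices.
  assert(Scale != 0 && (Scale & (Scale - 1)) == 0 && "scale must be 2^k");
  assert(Index.size() >= Mask.size() && "index narrower than data");

  std::vector<int32_t> Result(Mask.size(), 0); // 0 = placeholder fill
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    if (!Mask[I] || I >= VLen)
      continue; // lane disabled by the mask or by the vector length
    int32_t V;
    std::memcpy(&V, Base + Index[I] * static_cast<int64_t>(Scale), sizeof(V));
    Result[I] = V;
  }
  return Result;
}
```

A scatter model would be the mirror image: the same activity test per lane, with a write to the computed address instead of a read.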

		SDValue SelectionDAG::getScatterVP(SDVTList VTs, EVT VT, const SDLoc &dl,
		ArrayRef<SDValue> Ops,
		MachineMemOperand *MMO,
		ISD::MemIndexType IndexType) {
		assert(Ops.size() == 7 && "Incompatible number of operands");

		FoldingSetNodeID ID;
		AddNodeIDNode(ID, ISD::VP_SCATTER, VTs, Ops);
		ID.AddInteger(VT.getRawBits());
		ID.AddInteger(getSyntheticNodeSubclassData<VPScatterSDNode>(
		dl.getIROrder(), VTs, VT, MMO, IndexType));
		ID.AddInteger(MMO->getPointerInfo().getAddrSpace());
		void *IP = nullptr;
		if (SDNode *E = FindNodeOrInsertPos(ID, dl, IP)) {
		cast<VPScatterSDNode>(E)->refineAlignment(MMO);
		return SDValue(E, 0);
		}
		auto *N = newSDNode<VPScatterSDNode>(dl.getIROrder(), dl.getDebugLoc(),
		VTs, VT, MMO, IndexType);
		createOperands(N, Ops);

		assert(N->getMask().getValueType().getVectorNumElements() ==
		N->getValue().getValueType().getVectorNumElements() &&
		"Vector width mismatch between mask and data");
		assert(N->getIndex().getValueType().getVectorNumElements() >=
		N->getValue().getValueType().getVectorNumElements() &&
		"Vector width mismatch between index and data");
		assert(isa<ConstantSDNode>(N->getScale()) &&
		cast<ConstantSDNode>(N->getScale())->getAPIntValue().isPowerOf2() &&
		"Scale should be a constant power of 2");

		CSEMap.InsertNode(N, IP);
		InsertNode(N);
		SDValue V(N, 0);
		NewSDValueDbgMsg(V, "Creating new node: ", this);
		return V;
		}

SDValue SelectionDAG::getMaskedStore(SDValue Chain, const SDLoc &dl,		SDValue SelectionDAG::getMaskedStore(SDValue Chain, const SDLoc &dl,
SDValue Val, SDValue Base, SDValue Offset,		SDValue Val, SDValue Base, SDValue Offset,
SDValue Mask, EVT MemVT,		SDValue Mask, EVT MemVT,
MachineMemOperand *MMO,		MachineMemOperand *MMO,
ISD::MemIndexedMode AM, bool IsTruncating,		ISD::MemIndexedMode AM, bool IsTruncating,
bool IsCompressing) {		bool IsCompressing) {
assert(Chain.getValueType() == MVT::Other &&		assert(Chain.getValueType() == MVT::Other &&
"Invalid chain type");		"Invalid chain type");
▲ Show 20 Lines • Show All 2,664 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h

Show First 20 Lines • Show All 762 Lines • ▼ Show 20 Lines	private:
void visitLoadFromSwiftError(const LoadInst &I);		void visitLoadFromSwiftError(const LoadInst &I);
void visitStoreToSwiftError(const StoreInst &I);		void visitStoreToSwiftError(const StoreInst &I);
void visitFreeze(const FreezeInst &I);		void visitFreeze(const FreezeInst &I);

void visitInlineAsm(ImmutableCallSite CS);		void visitInlineAsm(ImmutableCallSite CS);
void visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);		void visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);
void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);		void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);
void visitConstrainedFPIntrinsic(const ConstrainedFPIntrinsic &FPI);		void visitConstrainedFPIntrinsic(const ConstrainedFPIntrinsic &FPI);
		void visitVectorPredicationIntrinsic(const VPIntrinsic &VPI);
		void visitCmpVP(const VPIntrinsic &I);
		void visitLoadVP(const CallInst &I);
		void visitStoreVP(const CallInst &I);
		void visitGatherVP(const CallInst &I);
		void visitScatterVP(const CallInst &I);

void visitVAStart(const CallInst &I);		void visitVAStart(const CallInst &I);
void visitVAArg(const VAArgInst &I);		void visitVAArg(const VAArgInst &I);
void visitVAEnd(const CallInst &I);		void visitVAEnd(const CallInst &I);
void visitVACopy(const CallInst &I);		void visitVACopy(const CallInst &I);
void visitStackmap(const CallInst &I);		void visitStackmap(const CallInst &I);
void visitPatchpoint(ImmutableCallSite CS,		void visitPatchpoint(ImmutableCallSite CS,
const BasicBlock *EHPadBB = nullptr);		const BasicBlock *EHPadBB = nullptr);
▲ Show 20 Lines • Show All 135 Lines • Show Last 20 Lines
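
Review note: the `visitVectorPredicationIntrinsic` entry point declared in this header is reached through macro-generated `case` labels (`REGISTER_VP_INTRINSIC`), and the generic intrinsic-to-opcode mapping uses the same mechanism (`HANDLE_VP_TO_SDNODE`), both fed by `VPIntrinsics.def`. The sketch below shows that X-macro technique in isolation. Since a standalone example cannot include the `.def` file, an in-file table macro plays its role; `MY_VP_TABLE`, the enum names and the three table entries are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical table of (intrinsic id, DAG opcode) pairs. In the patch this
// role is played by a .def file that is #included after defining the macro.
#define MY_VP_TABLE(X)                                                         \
  X(vp_add, VP_ADD)                                                            \
  X(vp_sub, VP_SUB)                                                            \
  X(vp_mul, VP_MUL)

enum class IntrinsicID {
#define AS_ENUM(ID, NODE) ID,
  MY_VP_TABLE(AS_ENUM)
#undef AS_ENUM
  not_vp
};

enum class Opcode {
#define AS_ENUM(ID, NODE) NODE,
  MY_VP_TABLE(AS_ENUM)
#undef AS_ENUM
  NONE
};

// One run of fall-through case labels, one shared body.
bool isVP(IntrinsicID I) {
  switch (I) {
#define AS_CASE(ID, NODE) case IntrinsicID::ID:
    MY_VP_TABLE(AS_CASE)
#undef AS_CASE
    return true;
  default:
    return false;
  }
}

// One case per table row, each returning its paired opcode.
Opcode toOpcode(IntrinsicID I) {
  switch (I) {
#define AS_MAP(ID, NODE)                                                       \
  case IntrinsicID::ID:                                                        \
    return Opcode::NODE;
    MY_VP_TABLE(AS_MAP)
#undef AS_MAP
  default:
    return Opcode::NONE;
  }
}
```

The benefit of the pattern is that adding a row to the table updates every switch that expands it, so the dispatcher and the opcode mapping cannot drift apart.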

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


Show First 20 Lines • Show All 4,353 Lines • ▼ Show 20 Lines	getMachineMemOperand(MachinePointerInfo(PtrOperand),
Alignment, AAInfo);		Alignment, AAInfo);
SDValue StoreNode =		SDValue StoreNode =
DAG.getMaskedStore(getMemoryRoot(), sdl, Src0, Ptr, Offset, Mask, VT, MMO,		DAG.getMaskedStore(getMemoryRoot(), sdl, Src0, Ptr, Offset, Mask, VT, MMO,
ISD::UNINDEXED, false /* Truncating */, IsCompressing);		ISD::UNINDEXED, false /* Truncating */, IsCompressing);
DAG.setRoot(StoreNode);		DAG.setRoot(StoreNode);
setValue(&I, StoreNode);		setValue(&I, StoreNode);
}		}

		void SelectionDAGBuilder::visitStoreVP(const CallInst &I) {
		SDLoc sdl = getCurSDLoc();

		auto getVPStoreOps = [&](Value* &Ptr, Value* &Mask, Value* &Src0,
		Value * &VLen, unsigned & Alignment) {
		// llvm.vp.store.*(Src0, Ptr, Mask, VLen)
		Src0 = I.getArgOperand(0);
		Ptr = I.getArgOperand(1);
		Alignment = I.getParamAlignment(1);
		Mask = I.getArgOperand(2);
		VLen = I.getArgOperand(3);
		};

		Value *PtrOperand, *MaskOperand, *Src0Operand, *VLenOperand;
		unsigned Alignment = 0;
		getVPStoreOps(PtrOperand, MaskOperand, Src0Operand, VLenOperand, Alignment);

		SDValue Ptr = getValue(PtrOperand);
		SDValue Src0 = getValue(Src0Operand);
		SDValue Mask = getValue(MaskOperand);
		SDValue VLen = getValue(VLenOperand);

		EVT VT = Src0.getValueType();
		if (!Alignment)
		Alignment = DAG.getEVTAlignment(VT);

		AAMDNodes AAInfo;
		I.getAAMetadata(AAInfo);

		MachineMemOperand *MMO =
		DAG.getMachineFunction().
		getMachineMemOperand(MachinePointerInfo(PtrOperand),
		MachineMemOperand::MOStore, VT.getStoreSize(),
		Alignment, AAInfo);
		SDValue StoreNode = DAG.getStoreVP(getRoot(), sdl, Src0, Ptr, Mask, VLen, VT,
		MMO, false /* Truncating */);
		DAG.setRoot(StoreNode);
		setValue(&I, StoreNode);
		}

// Get a uniform base for the Gather/Scatter intrinsic.		// Get a uniform base for the Gather/Scatter intrinsic.
// The first argument of the Gather/Scatter intrinsic is a vector of pointers.		// The first argument of the Gather/Scatter intrinsic is a vector of pointers.
// We try to represent it as a base pointer + vector of indices.		// We try to represent it as a base pointer + vector of indices.
// Usually, the vector of pointers comes from a 'getelementptr' instruction.		// Usually, the vector of pointers comes from a 'getelementptr' instruction.
// The first operand of the GEP may be a single pointer or a vector of pointers		// The first operand of the GEP may be a single pointer or a vector of pointers
// Example:		// Example:
// %gep.ptr = getelementptr i32, <8 x i32*> %vptr, <8 x i32> %ind		// %gep.ptr = getelementptr i32, <8 x i32*> %vptr, <8 x i32> %ind
// or		// or
▲ Show 20 Lines • Show All 251 Lines • ▼ Show 20 Lines	SDValue Gather = DAG.getMaskedGather(DAG.getVTList(VT, MVT::Other), VT, sdl,
Ops, MMO, IndexType);		Ops, MMO, IndexType);

SDValue OutChain = Gather.getValue(1);		SDValue OutChain = Gather.getValue(1);
if (!ConstantMemory)		if (!ConstantMemory)
PendingLoads.push_back(OutChain);		PendingLoads.push_back(OutChain);
setValue(&I, Gather);		setValue(&I, Gather);
}		}

		void SelectionDAGBuilder::visitGatherVP(const CallInst &I) {
		SDLoc sdl = getCurSDLoc();

		// llvm.vp.gather.*(Ptrs, Mask, VLen)
		const Value *Ptr = I.getArgOperand(0);
		SDValue Mask = getValue(I.getArgOperand(1));
		SDValue VLen = getValue(I.getArgOperand(2));

		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
		EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
		unsigned Alignment = I.getParamAlignment(0);
		if (!Alignment)
		Alignment = DAG.getEVTAlignment(VT);

		AAMDNodes AAInfo;
		I.getAAMetadata(AAInfo);
		const MDNode *Ranges = I.getMetadata(LLVMContext::MD_range);

		SDValue Root = DAG.getRoot();
		SDValue Base;
		SDValue Index;
		ISD::MemIndexType IndexType;
		SDValue Scale;
		const Value *BasePtr = Ptr;
		bool UniformBase = getUniformBase(BasePtr, Base, Index, IndexType, Scale, this);
		bool ConstantMemory = false;
		if (UniformBase && AA &&
		AA->pointsToConstantMemory(
		MemoryLocation(BasePtr,
		LocationSize::precise(
		DAG.getDataLayout().getTypeStoreSize(I.getType())),
		AAInfo))) {
		// Do not serialize (non-volatile) loads of constant memory with anything.
		Root = DAG.getEntryNode();
		ConstantMemory = true;
		}

		MachineMemOperand *MMO =
		DAG.getMachineFunction().
		getMachineMemOperand(MachinePointerInfo(UniformBase ? BasePtr : nullptr),
		MachineMemOperand::MOLoad, VT.getStoreSize(),
		Alignment, AAInfo, Ranges);

		if (!UniformBase) {
		Base = DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout()));
		Index = getValue(Ptr);
		IndexType = ISD::SIGNED_SCALED;
		Scale = DAG.getTargetConstant(1, sdl, TLI.getPointerTy(DAG.getDataLayout()));
		}
		SDValue Ops[] = { Root, Base, Index, Scale, Mask, VLen };
		SDValue Gather = DAG.getGatherVP(DAG.getVTList(VT, MVT::Other), VT, sdl, Ops, MMO, IndexType);

		SDValue OutChain = Gather.getValue(1);
		if (!ConstantMemory)
		PendingLoads.push_back(OutChain);
		setValue(&I, Gather);
		}

		void SelectionDAGBuilder::visitScatterVP(const CallInst &I) {
		SDLoc sdl = getCurSDLoc();

		// llvm.vp.scatter.*(Src0, Ptrs, Mask, VLen)
		const Value *Ptr = I.getArgOperand(1);
		SDValue Src0 = getValue(I.getArgOperand(0));
		SDValue Mask = getValue(I.getArgOperand(2));
		SDValue VLen = getValue(I.getArgOperand(3));
		EVT VT = Src0.getValueType();
		unsigned Alignment = I.getParamAlignment(1);
		if (!Alignment)
		Alignment = DAG.getEVTAlignment(VT);
		const TargetLowering &TLI = DAG.getTargetLoweringInfo();

		AAMDNodes AAInfo;
		I.getAAMetadata(AAInfo);

		SDValue Base;
		SDValue Index;
		ISD::MemIndexType IndexType;
		SDValue Scale;
		const Value *BasePtr = Ptr;
		bool UniformBase = getUniformBase(BasePtr, Base, Index, IndexType, Scale, this);

		const Value *MemOpBasePtr = UniformBase ? BasePtr : nullptr;
		MachineMemOperand *MMO = DAG.getMachineFunction().
		getMachineMemOperand(MachinePointerInfo(MemOpBasePtr),
		MachineMemOperand::MOStore, VT.getStoreSize(),
		Alignment, AAInfo);
		if (!UniformBase) {
		Base = DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout()));
		Index = getValue(Ptr);
		IndexType = ISD::SIGNED_SCALED;
		Scale = DAG.getTargetConstant(1, sdl, TLI.getPointerTy(DAG.getDataLayout()));
		}
		SDValue Ops[] = { getRoot(), Src0, Base, Index, Scale, Mask, VLen };
		SDValue Scatter = DAG.getScatterVP(DAG.getVTList(MVT::Other), VT, sdl,
		Ops, MMO, IndexType);
		DAG.setRoot(Scatter);
		setValue(&I, Scatter);
		}

		void SelectionDAGBuilder::visitLoadVP(const CallInst &I) {
		SDLoc sdl = getCurSDLoc();

		auto getMaskedLoadOps = [&](Value* &Ptr, Value* &Mask, Value* &VLen,
		unsigned& Alignment) {
		// llvm.vp.load.*(Ptr, Mask, VLen)
		Ptr = I.getArgOperand(0);
		Alignment = I.getParamAlignment(0);
		Mask = I.getArgOperand(1);
		VLen = I.getArgOperand(2);
		};

		Value *PtrOperand, *MaskOperand, *VLenOperand;
		unsigned Alignment;
		getMaskedLoadOps(PtrOperand, MaskOperand, VLenOperand, Alignment);

		SDValue Ptr = getValue(PtrOperand);
		SDValue VLen = getValue(VLenOperand);
		SDValue Mask = getValue(MaskOperand);

		// infer the return type
		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
		SmallVector<EVT, 4> ValValueVTs;
		ComputeValueVTs(TLI, DAG.getDataLayout(), I.getType(), ValValueVTs);
		EVT VT = ValValueVTs[0];
		assert((ValValueVTs.size() == 1) && "splitting not implemented");

		if (!Alignment)
		Alignment = DAG.getEVTAlignment(VT);

		AAMDNodes AAInfo;
		I.getAAMetadata(AAInfo);
		const MDNode *Ranges = I.getMetadata(LLVMContext::MD_range);

		// Do not serialize masked loads of constant memory with anything.
		bool AddToChain =
		!AA \|\| !AA->pointsToConstantMemory(MemoryLocation(
		PtrOperand,
		LocationSize::precise(
		DAG.getDataLayout().getTypeStoreSize(I.getType())),
		AAInfo));
		SDValue InChain = AddToChain ? DAG.getRoot() : DAG.getEntryNode();

		MachineMemOperand *MMO =
		DAG.getMachineFunction().
		getMachineMemOperand(MachinePointerInfo(PtrOperand),
		MachineMemOperand::MOLoad, VT.getStoreSize(),
		Alignment, AAInfo, Ranges);

		SDValue Load = DAG.getLoadVP(VT, sdl, InChain, Ptr, Mask, VLen, VT, MMO,
		ISD::NON_EXTLOAD);
		if (AddToChain)
		PendingLoads.push_back(Load.getValue(1));
		setValue(&I, Load);
		}
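
Review note: `visitLoadVP` and `visitStoreVP` both forward a mask and an explicit vector length to the DAG node. As a reading aid, the sketch below models contiguous predicated memory access in scalar code, assuming a lane participates only when its mask bit is set and its position is below the vector length. `laneActive`, `loadModel`, `storeModel` and the `Fill` parameter are hypothetical; in particular, `Fill` is a placeholder for inactive result lanes and does not model what the proposal specifies for them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a lane is active when it is below the explicit
// vector length and its mask bit is set.
inline bool laneActive(const std::vector<bool> &Mask, std::size_t VLen,
                       std::size_t Lane) {
  return Lane < VLen && Mask[Lane];
}

// Contiguous predicated load: active lanes read Mem[Lane]; inactive lanes
// keep the caller-provided Fill value.
std::vector<int> loadModel(const std::vector<int> &Mem,
                           const std::vector<bool> &Mask, std::size_t VLen,
                           int Fill) {
  assert(Mem.size() >= Mask.size() && "memory shorter than vector");
  std::vector<int> Result(Mask.size(), Fill);
  for (std::size_t L = 0; L < Mask.size(); ++L)
    if (laneActive(Mask, VLen, L))
      Result[L] = Mem[L];
  return Result;
}

// Contiguous predicated store: only active lanes write to memory.
void storeModel(std::vector<int> &Mem, const std::vector<int> &Val,
                const std::vector<bool> &Mask, std::size_t VLen) {
  assert(Val.size() == Mask.size() && Mem.size() >= Val.size());
  for (std::size_t L = 0; L < Mask.size(); ++L)
    if (laneActive(Mask, VLen, L))
      Mem[L] = Val[L];
}
```

With `Mask = {1, 1, 0, 1}` and `VLen = 3`, only lanes 0 and 1 are touched: lane 2 is masked off and lane 3 lies beyond the vector length.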

void SelectionDAGBuilder::visitAtomicCmpXchg(const AtomicCmpXchgInst &I) {		void SelectionDAGBuilder::visitAtomicCmpXchg(const AtomicCmpXchgInst &I) {
SDLoc dl = getCurSDLoc();		SDLoc dl = getCurSDLoc();
AtomicOrdering SuccessOrdering = I.getSuccessOrdering();		AtomicOrdering SuccessOrdering = I.getSuccessOrdering();
AtomicOrdering FailureOrdering = I.getFailureOrdering();		AtomicOrdering FailureOrdering = I.getFailureOrdering();
SyncScope::ID SSID = I.getSyncScopeID();		SyncScope::ID SSID = I.getSyncScopeID();

SDValue InChain = getRoot();		SDValue InChain = getRoot();

▲ Show 20 Lines • Show All 1,651 Lines • ▼ Show 20 Lines	setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(1)),		getValue(I.getArgOperand(1)),
getValue(I.getArgOperand(2))));		getValue(I.getArgOperand(2))));
return;		return;
#define INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC) \		#define INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC) \
case Intrinsic::INTRINSIC:		case Intrinsic::INTRINSIC:
#include "llvm/IR/ConstrainedOps.def"		#include "llvm/IR/ConstrainedOps.def"
visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(I));		visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(I));
return;		return;

		#define REGISTER_VP_INTRINSIC(VPID,MASKPOS,VLENPOS) case Intrinsic::VPID:
		#include "llvm/IR/VPIntrinsics.def"
		visitVectorPredicationIntrinsic(cast<VPIntrinsic>(I));
		return;

case Intrinsic::fmuladd: {		case Intrinsic::fmuladd: {
EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());		EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&		if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&
TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT)) {		TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT)) {
setValue(&I, DAG.getNode(ISD::FMA, sdl,		setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(0)).getValueType(),		getValue(I.getArgOperand(0)).getValueType(),
getValue(I.getArgOperand(0)),		getValue(I.getArgOperand(0)),
getValue(I.getArgOperand(1)),		getValue(I.getArgOperand(1)),
▲ Show 20 Lines • Show All 823 Lines • ▼ Show 20 Lines	#include "llvm/IR/ConstrainedOps.def"

SDValue Result = DAG.getNode(Opcode, sdl, VTs, Opers);		SDValue Result = DAG.getNode(Opcode, sdl, VTs, Opers);
pushOutChain(Result, EB);		pushOutChain(Result, EB);

SDValue FPResult = Result.getValue(0);		SDValue FPResult = Result.getValue(0);
setValue(&FPI, FPResult);		setValue(&FPI, FPResult);
}		}

		void SelectionDAGBuilder::visitCmpVP(const VPIntrinsic &I) {
		ISD::CondCode Condition;
		CmpInst::Predicate predicate = I.getCmpPredicate();
		bool IsFP = I.getOperand(0)->getType()->isFPOrFPVectorTy();
		if (IsFP) {
		Condition = getFCmpCondCode(predicate);
		auto *FPMO = dyn_cast<FPMathOperator>(&I);
		if ((FPMO && FPMO->hasNoNaNs()) \|\| TM.Options.NoNaNsFPMath)
		Condition = getFCmpCodeWithoutNaN(Condition);

		} else {
		Condition = getICmpCondCode(predicate);
		}

		SDValue Op1 = getValue(I.getOperand(0));
		SDValue Op2 = getValue(I.getOperand(1));
		// #2 is the condition code
		SDValue MaskOp = getValue(I.getOperand(3));
		SDValue LenOp = getValue(I.getOperand(4));

		EVT DestVT = DAG.getTargetLoweringInfo().getValueType(DAG.getDataLayout(),
		I.getType());
		setValue(&I, DAG.getVPSetCC(getCurSDLoc(), DestVT, Op1, Op2, Condition, MaskOp, LenOp));
		}
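
Review note: `visitCmpVP` derives the condition code from the predicate and, for floating-point compares, relaxes it when NaNs are known to be absent (per-instruction fast-math flag or the global `NoNaNsFPMath` option). The sketch below illustrates that relaxation on a reduced, hypothetical condition-code enum: ordered and unordered variants agree once NaNs are excluded, so both collapse to the plain comparison. `CC`, `withoutNaN` and `selectCondCode` are illustrative names and do not reproduce `ISD::CondCode` or the LLVM helpers.

```cpp
#include <cassert>

// Hypothetical, reduced condition-code set: ordered (O), unordered (U) and
// NaN-agnostic spellings of four comparisons.
enum class CC { OEQ, UEQ, EQ, ONE, UNE, NE, OLT, ULT, LT, OGT, UGT, GT };

// With NaNs excluded, the ordered and unordered variants are equivalent,
// so both map to the plain comparison.
CC withoutNaN(CC C) {
  switch (C) {
  case CC::OEQ:
  case CC::UEQ:
    return CC::EQ;
  case CC::ONE:
  case CC::UNE:
    return CC::NE;
  case CC::OLT:
  case CC::ULT:
    return CC::LT;
  case CC::OGT:
  case CC::UGT:
    return CC::GT;
  default:
    return C; // already NaN-agnostic
  }
}

// Relax only if the instruction carries a no-NaNs flag or the global
// option says NaNs cannot occur.
CC selectCondCode(CC FromPredicate, bool InstrNoNaNs, bool GlobalNoNaNs) {
  return (InstrNoNaNs || GlobalNoNaNs) ? withoutNaN(FromPredicate)
                                       : FromPredicate;
}
```

For example, an ordered less-than stays `OLT` by default and becomes `LT` as soon as either no-NaNs source is set.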

		void SelectionDAGBuilder::visitVectorPredicationIntrinsic(
		const VPIntrinsic &VPIntrin) {
		SDLoc sdl = getCurSDLoc();
		unsigned Opcode;
		switch (VPIntrin.getIntrinsicID()) {
		default:
		llvm_unreachable("Unforeseen intrinsic"); // Can't reach here.

		case Intrinsic::vp_load:
		visitLoadVP(VPIntrin);
		return;
		case Intrinsic::vp_store:
		visitStoreVP(VPIntrin);
		return;
		case Intrinsic::vp_gather:
		visitGatherVP(VPIntrin);
		return;
		case Intrinsic::vp_scatter:
		visitScatterVP(VPIntrin);
		return;

		case Intrinsic::vp_fcmp:
		case Intrinsic::vp_icmp:
		visitCmpVP(VPIntrin);
		return;

		// Generic mappings
		#define HANDLE_VP_TO_SDNODE(VPID, NODEID) \
		case Intrinsic::VPID: Opcode = ISD::NODEID; break;
		#include "llvm/IR/VPIntrinsics.def"
		}

		// TODO memory evl: SDValue Chain = getRoot();

		SmallVector<EVT, 4> ValueVTs;
		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
		ComputeValueVTs(TLI, DAG.getDataLayout(), VPIntrin.getType(), ValueVTs);
		SDVTList VTs = DAG.getVTList(ValueVTs);

		// ValueVTs.push_back(MVT::Other); // Out chain

		// Request Operands
		SmallVector<SDValue,7> OpValues;
		auto ExceptPosOpt = VPIntrinsic::GetExceptionBehaviorParamPos(VPIntrin.getIntrinsicID());
		auto RoundingModePosOpt = VPIntrinsic::GetRoundingModeParamPos(VPIntrin.getIntrinsicID());
		for (int i = 0; i < (int) VPIntrin.getNumArgOperands(); ++i) {
		if (ExceptPosOpt && (i == ExceptPosOpt.getValue())) continue;
		if (RoundingModePosOpt && (i == RoundingModePosOpt.getValue())) continue;
		OpValues.push_back(getValue(VPIntrin.getArgOperand(i)));
		}
		SDValue Result = DAG.getNode(Opcode, sdl, VTs, OpValues);


		SDNodeFlags NodeFlags;

		// set exception flags where appropriate
		NodeFlags.setNoFPExcept(!VPIntrin.isConstrainedOp());

		// copy FMF where available
		auto * FPIntrin = dyn_cast<FPMathOperator>(&VPIntrin);
		if (FPIntrin) NodeFlags.copyFMF(*FPIntrin);

		if (VPIntrin.isReductionOp()) {
		NodeFlags.setVectorReduction(true);
		}

		// Attach chain
		SDValue VPResult;
		if (Result.getNode()->getNumValues() == 2) {
		SDValue OutChain = Result.getValue(1);
		DAG.setRoot(OutChain);
		VPResult = Result.getValue(0);
		} else {
		VPResult = Result;
		}

		// attach flags and return
		if (NodeFlags.isDefined()) VPResult.getNode()->setFlags(NodeFlags);
		setValue(&VPIntrin, VPResult);
		}
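
The operand loop in visitVectorPredicationIntrinsic drops the exception-behavior and rounding-mode arguments before building the node. A minimal standalone sketch of that filtering step follows. It assumes C++17 and uses `std::optional` and plain `int` values as hypothetical stand-ins for `llvm::Optional` and `SDValue`; `collectDataOperands` is not part of the patch.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper: ints play the role of SDValues in this sketch.
// Copies every argument except the ones at the two optional positions.
std::vector<int> collectDataOperands(const std::vector<int> &Args,
                                     std::optional<int> ExceptPos,
                                     std::optional<int> RoundingPos) {
  std::vector<int> Out;
  for (int I = 0; I < static_cast<int>(Args.size()); ++I) {
    // Skip the exception-behavior position, if this intrinsic has one.
    if (ExceptPos && I == *ExceptPos)
      continue;
    // Skip the rounding-mode position, if this intrinsic has one.
    if (RoundingPos && I == *RoundingPos)
      continue;
    Out.push_back(Args[I]);
  }
  return Out;
}
```

With both positions absent the argument list is forwarded unchanged, which is the case for the integer intrinsics.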

std::pair<SDValue, SDValue>		std::pair<SDValue, SDValue>
SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,		SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,
const BasicBlock *EHPadBB) {		const BasicBlock *EHPadBB) {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
MCSymbol *BeginLabel = nullptr;		MCSymbol *BeginLabel = nullptr;

if (EHPadBB) {		if (EHPadBB) {

llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

#endif
case ISD::VECREDUCE_OR: return "vecreduce_or";		case ISD::VECREDUCE_OR: return "vecreduce_or";
case ISD::VECREDUCE_XOR: return "vecreduce_xor";		case ISD::VECREDUCE_XOR: return "vecreduce_xor";
case ISD::VECREDUCE_SMAX: return "vecreduce_smax";		case ISD::VECREDUCE_SMAX: return "vecreduce_smax";
case ISD::VECREDUCE_SMIN: return "vecreduce_smin";		case ISD::VECREDUCE_SMIN: return "vecreduce_smin";
case ISD::VECREDUCE_UMAX: return "vecreduce_umax";		case ISD::VECREDUCE_UMAX: return "vecreduce_umax";
case ISD::VECREDUCE_UMIN: return "vecreduce_umin";		case ISD::VECREDUCE_UMIN: return "vecreduce_umin";
case ISD::VECREDUCE_FMAX: return "vecreduce_fmax";		case ISD::VECREDUCE_FMAX: return "vecreduce_fmax";
case ISD::VECREDUCE_FMIN: return "vecreduce_fmin";		case ISD::VECREDUCE_FMIN: return "vecreduce_fmin";

		// Vector Predication
		#define REGISTER_VP_SDNODE(NODEID,NAME,MASKPOS,VLENPOS) \
		case ISD::NODEID: return NAME;
		#include "llvm/IR/VPIntrinsics.def"
}		}
}		}

const char *SDNode::getIndexedModeName(ISD::MemIndexedMode AM) {		const char *SDNode::getIndexedModeName(ISD::MemIndexedMode AM) {
switch (AM) {		switch (AM) {
default: return "";		default: return "";
case ISD::PRE_INC: return "<pre-inc>";		case ISD::PRE_INC: return "<pre-inc>";
case ISD::PRE_DEC: return "<pre-dec>";		case ISD::PRE_DEC: return "<pre-dec>";

llvm/lib/CodeGen/TargetPassConfig.cpp

if (getOptLevel() != CodeGenOpt::None && !DisableConstantHoisting)
addPass(createConstantHoistingPass());		addPass(createConstantHoistingPass());

if (getOptLevel() != CodeGenOpt::None && !DisablePartialLibcallInlining)		if (getOptLevel() != CodeGenOpt::None && !DisablePartialLibcallInlining)
addPass(createPartiallyInlineLibCallsPass());		addPass(createPartiallyInlineLibCallsPass());

// Instrument function entry and exit, e.g. with calls to mcount().		// Instrument function entry and exit, e.g. with calls to mcount().
addPass(createPostInlineEntryExitInstrumenterPass());		addPass(createPostInlineEntryExitInstrumenterPass());

		// Expand vector predication intrinsics into standard IR instructions.
		// This pass has to run before ScalarizeMaskedMemIntrin and ExpandReduction
		// passes since it emits those kinds of intrinsics.
		addPass(createExpandVectorPredicationPass());

// Add scalarization of target's unsupported masked memory intrinsics pass.		// Add scalarization of target's unsupported masked memory intrinsics pass.
// the unsupported intrinsic will be replaced with a chain of basic blocks,		// the unsupported intrinsic will be replaced with a chain of basic blocks,
// that stores/loads element one-by-one if the appropriate mask bit is set.		// that stores/loads element one-by-one if the appropriate mask bit is set.
addPass(createScalarizeMaskedMemIntrinPass());		addPass(createScalarizeMaskedMemIntrinPass());

// Expand reduction intrinsics into shuffle sequences if the target wants to.		// Expand reduction intrinsics into shuffle sequences if the target wants to.
addPass(createExpandReductionsPass());		addPass(createExpandReductionsPass());
}		}

llvm/lib/IR/Attributes.cpp

std::string Attribute::getAsString(bool InAttrGrp) const {
if (hasAttribute(Attribute::AlwaysInline))		if (hasAttribute(Attribute::AlwaysInline))
return "alwaysinline";		return "alwaysinline";
if (hasAttribute(Attribute::ArgMemOnly))		if (hasAttribute(Attribute::ArgMemOnly))
return "argmemonly";		return "argmemonly";
if (hasAttribute(Attribute::Builtin))		if (hasAttribute(Attribute::Builtin))
return "builtin";		return "builtin";
if (hasAttribute(Attribute::Convergent))		if (hasAttribute(Attribute::Convergent))
return "convergent";		return "convergent";
		if (hasAttribute(Attribute::VectorLength))
		return "vlen";
if (hasAttribute(Attribute::SwiftError))		if (hasAttribute(Attribute::SwiftError))
return "swifterror";		return "swifterror";
if (hasAttribute(Attribute::SwiftSelf))		if (hasAttribute(Attribute::SwiftSelf))
return "swiftself";		return "swiftself";
if (hasAttribute(Attribute::InaccessibleMemOnly))		if (hasAttribute(Attribute::InaccessibleMemOnly))
return "inaccessiblememonly";		return "inaccessiblememonly";
if (hasAttribute(Attribute::InaccessibleMemOrArgMemOnly))		if (hasAttribute(Attribute::InaccessibleMemOrArgMemOnly))
return "inaccessiblemem_or_argmemonly";		return "inaccessiblemem_or_argmemonly";
if (hasAttribute(Attribute::InAlloca))		if (hasAttribute(Attribute::InAlloca))
return "inalloca";		return "inalloca";
if (hasAttribute(Attribute::InlineHint))		if (hasAttribute(Attribute::InlineHint))
return "inlinehint";		return "inlinehint";
if (hasAttribute(Attribute::InReg))		if (hasAttribute(Attribute::InReg))
return "inreg";		return "inreg";
if (hasAttribute(Attribute::JumpTable))		if (hasAttribute(Attribute::JumpTable))
return "jumptable";		return "jumptable";
		if (hasAttribute(Attribute::Mask))
		return "mask";
		if (hasAttribute(Attribute::Passthru))
		return "passthru";
if (hasAttribute(Attribute::MinSize))		if (hasAttribute(Attribute::MinSize))
return "minsize";		return "minsize";
if (hasAttribute(Attribute::Naked))		if (hasAttribute(Attribute::Naked))
return "naked";		return "naked";
if (hasAttribute(Attribute::Nest))		if (hasAttribute(Attribute::Nest))
return "nest";		return "nest";
if (hasAttribute(Attribute::NoAlias))		if (hasAttribute(Attribute::NoAlias))
return "noalias";		return "noalias";

llvm/lib/IR/CMakeLists.txt

add_llvm_component_library(LLVMCore
ModuleSummaryIndex.cpp		ModuleSummaryIndex.cpp
Operator.cpp		Operator.cpp
OptBisect.cpp		OptBisect.cpp
Pass.cpp		Pass.cpp
PassInstrumentation.cpp		PassInstrumentation.cpp
PassManager.cpp		PassManager.cpp
PassRegistry.cpp		PassRegistry.cpp
PassTimingInfo.cpp		PassTimingInfo.cpp
		PredicatedInst.cpp
SafepointIRVerifier.cpp		SafepointIRVerifier.cpp
ProfileSummary.cpp		ProfileSummary.cpp
Statepoint.cpp		Statepoint.cpp
Type.cpp		Type.cpp
TypeFinder.cpp		TypeFinder.cpp
Use.cpp		Use.cpp
User.cpp		User.cpp
		VPBuilder.cpp
Value.cpp		Value.cpp
ValueSymbolTable.cpp		ValueSymbolTable.cpp
Verifier.cpp		Verifier.cpp

ADDITIONAL_HEADER_DIRS		ADDITIONAL_HEADER_DIRS
${LLVM_MAIN_INCLUDE_DIR}/llvm/IR		${LLVM_MAIN_INCLUDE_DIR}/llvm/IR

LINK_LIBS ${LLVM_PTHREAD_LIB}		LINK_LIBS ${LLVM_PTHREAD_LIB}

DEPENDS		DEPENDS
intrinsics_gen		intrinsics_gen
)		)

llvm/lib/IR/FPEnv.cpp

//===-- FPEnv.cpp ---- FP Environment -------------------------------------===//		//===-- FPEnv.cpp ---- FP Environment -------------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
/// @file		/// @file
/// This file contains the implementations of entities that describe floating		/// This file contains the implementations of entities that describe floating
/// point environment.		/// point environment.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/StringSwitch.h"
#include "llvm/IR/FPEnv.h"		#include "llvm/IR/FPEnv.h"
		#include "llvm/ADT/StringSwitch.h"
		#include "llvm/IR/Metadata.h"

namespace llvm {		namespace llvm {

Optional<fp::RoundingMode> StrToRoundingMode(StringRef RoundingArg) {		Optional<fp::RoundingMode> StrToRoundingMode(StringRef RoundingArg) {
// For dynamic rounding mode, we use round to nearest but we will set the		// For dynamic rounding mode, we use round to nearest but we will set the
// 'exact' SDNodeFlag so that the value will not be rounded.		// 'exact' SDNodeFlag so that the value will not be rounded.
return StringSwitch<Optional<fp::RoundingMode>>(RoundingArg)		return StringSwitch<Optional<fp::RoundingMode>>(RoundingArg)
.Case("round.dynamic", fp::rmDynamic)		.Case("round.dynamic", fp::rmDynamic)
case fp::ebIgnore:
break;		break;
case fp::ebMayTrap:		case fp::ebMayTrap:
ExceptStr = "fpexcept.maytrap";		ExceptStr = "fpexcept.maytrap";
break;		break;
}		}
return ExceptStr;		return ExceptStr;
}		}

		Value *GetConstrainedFPExcept(LLVMContext &Context,
		fp::ExceptionBehavior UseExcept) {
		Optional<StringRef> ExceptStr = ExceptionBehaviorToStr(UseExcept);
		assert(ExceptStr.hasValue() && "Garbage strict exception behavior!");
		auto *ExceptMDS = MDString::get(Context, ExceptStr.getValue());

		return MetadataAsValue::get(Context, ExceptMDS);
}		}

		Value *GetConstrainedFPRounding(LLVMContext &Context,
		fp::RoundingMode UseRounding) {
		Optional<StringRef> RoundingStr = RoundingModeToStr(UseRounding);
		assert(RoundingStr.hasValue() && "Garbage strict rounding mode!");
		auto *RoundingMDS = MDString::get(Context, RoundingStr.getValue());

		return MetadataAsValue::get(Context, RoundingMDS);
		}

		} // namespace llvm

llvm/lib/IR/IRBuilder.cpp

CallInst *IRBuilderBase::CreateMaskedIntrinsic(Intrinsic::ID Id,
ArrayRef<Value *> Ops,		ArrayRef<Value *> Ops,
ArrayRef<Type *> OverloadedTypes,		ArrayRef<Type *> OverloadedTypes,
const Twine &Name) {		const Twine &Name) {
Module *M = BB->getParent()->getParent();		Module *M = BB->getParent()->getParent();
Function *TheFn = Intrinsic::getDeclaration(M, Id, OverloadedTypes);		Function *TheFn = Intrinsic::getDeclaration(M, Id, OverloadedTypes);
return createCallHelper(TheFn, Ops, this, Name);		return createCallHelper(TheFn, Ops, this, Name);
}		}

		/// Create a call to a vector-predicated intrinsic (VP).
		/// \p OC - The LLVM IR Opcode of the operation
		/// \p Params - Intrinsic operand list
		/// \p FMFSource - Copy source for Fast Math Flags
		/// \p Name - name of the result variable
		Instruction *IRBuilderBase::CreateVectorPredicatedInst(unsigned OC,
		ArrayRef<Value *> Params,
		Instruction *FMFSource,
		const Twine &Name) {

		Module *M = BB->getParent()->getParent();

		Intrinsic::ID VPID = VPIntrinsic::GetForOpcode(OC);
		auto VPFunc = VPIntrinsic::GetDeclarationForParams(M, VPID, Params);
		auto *VPCall = createCallHelper(VPFunc, Params, this, Name);

		// transfer fast math flags
		if (FMFSource && isa<FPMathOperator>(FMFSource)) {
		VPCall->copyFastMathFlags(FMFSource);
		}

		return VPCall;
		}

		/// Create a call to a vector-predicated comparison intrinsic (VP).
		/// \p Pred - comparison predicate
		/// \p FirstParam - First vector operand
		/// \p SndParam - Second vector operand
		/// \p MaskParam - Mask operand
		/// \p VectorLengthParam - Vector length operand
		/// \p Name - name of the result variable
		Instruction *IRBuilderBase::CreateVectorPredicatedCmp(
		CmpInst::Predicate Pred, Value *FirstParam, Value *SndParam,
		Value *MaskParam, Value *VectorLengthParam, const Twine &Name) {

		Module *M = BB->getParent()->getParent();

		// encode comparison predicate as an i8 constant
		uint8_t RawPred = static_cast<uint8_t>(Pred);
		auto Int8Ty = Type::getInt8Ty(getContext());
		auto PredParam = ConstantInt::get(Int8Ty, RawPred, false);

		Intrinsic::ID VPID = FirstParam->getType()->isIntOrIntVectorTy()
		? Intrinsic::vp_icmp
		: Intrinsic::vp_fcmp;

		auto VPFunc = VPIntrinsic::GetDeclarationForParams(
		M, VPID, {FirstParam, SndParam, PredParam, MaskParam, VectorLengthParam});

		return createCallHelper(
		VPFunc, {FirstParam, SndParam, PredParam, MaskParam, VectorLengthParam},
		this, Name);
		}
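
CreateVectorPredicatedCmp passes the predicate as a raw 8-bit integer operand, and VPIntrinsic::getCmpPredicate casts the operand value back. The round trip can be sketched in isolation as below. The `Pred` enum and both helper names are hypothetical, and the numeric values are illustrative only; they are not claimed to match `CmpInst::Predicate`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate enum, for illustration only.
enum class Pred : uint8_t { EQ = 0, NE = 1, LT = 2, GE = 3 };

// Encode: narrow the enum to the raw byte that is passed as an operand.
uint8_t encodePred(Pred P) { return static_cast<uint8_t>(P); }

// Decode: take the zero-extended operand value and cast back to the enum.
Pred decodePred(uint64_t Raw) { return static_cast<Pred>(Raw); }
```

The scheme relies on every predicate value fitting into 8 bits, so the decode side can zero-extend without losing information.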

/// Create a call to a Masked Gather intrinsic.		/// Create a call to a Masked Gather intrinsic.
/// \p Ptrs - vector of pointers for loading		/// \p Ptrs - vector of pointers for loading
/// \p Align - alignment for one element		/// \p Align - alignment for one element
/// \p Mask - vector of booleans which indicates what vector lanes should		/// \p Mask - vector of booleans which indicates what vector lanes should
/// be accessed in memory		/// be accessed in memory
/// \p PassThru - pass-through value that is used to fill the masked-off lanes		/// \p PassThru - pass-through value that is used to fill the masked-off lanes
/// of the result		/// of the result
/// \p Name - name of the result variable		/// \p Name - name of the result variable

llvm/lib/IR/IntrinsicInst.cpp

//		//
// In some cases, arguments to intrinsics need to be generic and are defined as		// In some cases, arguments to intrinsics need to be generic and are defined as
// type pointer to empty struct { }*. To access the real item of interest the		// type pointer to empty struct { }*. To access the real item of interest the
// cast instruction needs to be stripped away.		// cast instruction needs to be stripped away.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Operator.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugInfoMetadata.h"		#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
		#include "llvm/IR/Operator.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
using namespace llvm;		using namespace llvm;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
/// DbgVariableIntrinsic - This is the common base class for debug info		/// DbgVariableIntrinsic - This is the common base class for debug info
/// intrinsics for variables.		/// intrinsics for variables.
///		///

Optional<fp::RoundingMode> ConstrainedFPIntrinsic::getRoundingMode() const {
if (!MD || !isa<MDString>(MD))		if (!MD || !isa<MDString>(MD))
return None;		return None;
return StrToRoundingMode(cast<MDString>(MD)->getString());		return StrToRoundingMode(cast<MDString>(MD)->getString());
}		}

Optional<fp::ExceptionBehavior>		Optional<fp::ExceptionBehavior>
ConstrainedFPIntrinsic::getExceptionBehavior() const {		ConstrainedFPIntrinsic::getExceptionBehavior() const {
unsigned NumOperands = getNumArgOperands();		unsigned NumOperands = getNumArgOperands();
		assert(NumOperands >= 1 && "underflow");
Metadata *MD =		Metadata *MD =
cast<MetadataAsValue>(getArgOperand(NumOperands - 1))->getMetadata();		cast<MetadataAsValue>(getArgOperand(NumOperands - 1))->getMetadata();
if (!MD || !isa<MDString>(MD))		if (!MD || !isa<MDString>(MD))
return None;		return None;
return StrToExceptionBehavior(cast<MDString>(MD)->getString());		return StrToExceptionBehavior(cast<MDString>(MD)->getString());
}		}

FCmpInst::Predicate		FCmpInst::Predicate
#define INSTRUCTION(NAME, NARGS, ROUND_MODE, INTRINSIC) \
case Intrinsic::INTRINSIC:		case Intrinsic::INTRINSIC:
#include "llvm/IR/ConstrainedOps.def"		#include "llvm/IR/ConstrainedOps.def"
return true;		return true;
default:		default:
return false;		return false;
}		}
}		}

		ElementCount VPIntrinsic::getVectorLength() const {
		auto GetVectorLengthOfType = [](const Type *T) -> ElementCount {
		auto VT = cast<VectorType>(T);
		auto ElemCount = VT->getElementCount();
		return ElemCount;
		};

		auto VPMask = getMaskParam();
		if (VPMask) {
		return GetVectorLengthOfType(VPMask->getType());
		}

		// only compose does not have a mask param
		assert(getIntrinsicID() == Intrinsic::vp_compose);
		return GetVectorLengthOfType(getType());
		}

		void VPIntrinsic::setMaskParam(Value *NewMask) {
		auto MaskPos = GetMaskParamPos(getIntrinsicID());
		assert(MaskPos.hasValue());
		this->setOperand(MaskPos.getValue(), NewMask);
		}

		void VPIntrinsic::setVectorLengthParam(Value *NewVL) {
		auto VLPos = GetVectorLengthParamPos(getIntrinsicID());
		assert(VLPos.hasValue());
		this->setOperand(VLPos.getValue(), NewVL);
		}

		Value *VPIntrinsic::getMaskParam() const {
		auto maskPos = GetMaskParamPos(getIntrinsicID());
		if (maskPos)
		return getArgOperand(maskPos.getValue());
		return nullptr;
		}

		Value *VPIntrinsic::getVectorLengthParam() const {
		auto vlenPos = GetVectorLengthParamPos(getIntrinsicID());
		if (vlenPos)
		return getArgOperand(vlenPos.getValue());
		return nullptr;
		}

		Optional<int> VPIntrinsic::GetMaskParamPos(Intrinsic::ID IntrinsicID) {
		switch (IntrinsicID) {
		default:
		return None;

		#define REGISTER_VP_INTRINSIC(VPID, MASKPOS, VLENPOS) \
		case Intrinsic::VPID: \
		return MASKPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Optional<int> VPIntrinsic::GetVectorLengthParamPos(Intrinsic::ID IntrinsicID) {
		switch (IntrinsicID) {
		default:
		return None;

		#define REGISTER_VP_INTRINSIC(VPID, MASKPOS, VLENPOS) \
		case Intrinsic::VPID: \
		return VLENPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		bool VPIntrinsic::IsVPIntrinsic(Intrinsic::ID ID) {
		switch (ID) {
		default:
		return false;

		#define REGISTER_VP_INTRINSIC(VPID, MASKPOS, VLENPOS) \
		case Intrinsic::VPID: \
		break;
		#include "llvm/IR/VPIntrinsics.def"
		}
		return true;
		}
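
GetMaskParamPos, GetVectorLengthParamPos and IsVPIntrinsic all expand the same REGISTER_VP_INTRINSIC entries from VPIntrinsics.def into different switch bodies. A self-contained sketch of that X-macro pattern follows. It assumes C++17; the table `DEMO_VP_INTRINSICS`, the `DemoID` enum and the positions in it are made up for illustration and do not reflect the contents of the real .def file.

```cpp
#include <cassert>
#include <optional>

// Hypothetical table: X(name, mask position, vector-length position).
#define DEMO_VP_INTRINSICS(X) \
  X(demo_add, 2, 3) \
  X(demo_store, 2, 3) \
  X(demo_select, 0, 3)

enum class DemoID {
  not_intrinsic,
#define X(NAME, MASKPOS, VLENPOS) NAME,
  DEMO_VP_INTRINSICS(X)
#undef X
  demo_other
};

// Each query re-expands the table with a different per-entry macro.
std::optional<int> getMaskPos(DemoID ID) {
  switch (ID) {
  default:
    return std::nullopt;
#define X(NAME, MASKPOS, VLENPOS) case DemoID::NAME: return MASKPOS;
    DEMO_VP_INTRINSICS(X)
#undef X
  }
}

std::optional<int> getVLenPos(DemoID ID) {
  switch (ID) {
  default:
    return std::nullopt;
#define X(NAME, MASKPOS, VLENPOS) case DemoID::NAME: return VLENPOS;
    DEMO_VP_INTRINSICS(X)
#undef X
  }
}

bool isDemoVP(DemoID ID) {
  switch (ID) {
  default:
    return false;
#define X(NAME, MASKPOS, VLENPOS) case DemoID::NAME:
    DEMO_VP_INTRINSICS(X)
#undef X
    return true; // every listed case falls through to here
  }
}
```

Keeping the table in one place means a new intrinsic needs a single new entry, and all three queries pick it up on the next build.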

		Intrinsic::ID VPIntrinsic::GetConstrainedIntrinsicForVP(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return Intrinsic::not_intrinsic;

		#define HANDLE_VP_TO_CONSTRAINED_INTRIN(VPID, CFPID) \
		case Intrinsic::VPID: \
		return Intrinsic::CFPID;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Intrinsic::ID VPIntrinsic::GetFunctionalIntrinsicForVP(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return Intrinsic::not_intrinsic;

		#define HANDLE_VP_TO_INTRIN(VPID, IID) \
		case Intrinsic::VPID: \
		return Intrinsic::IID;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		// Equivalent non-predicated opcode
		unsigned VPIntrinsic::GetFunctionalOpcodeForVP(Intrinsic::ID ID) {
		switch (ID) {
		default:
		return Instruction::Call;

		#define HANDLE_VP_TO_OC(VPID, OC) \
		case Intrinsic::VPID: \
		return Instruction::OC;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Intrinsic::ID VPIntrinsic::GetForOpcode(unsigned OC) {
		switch (OC) {
		default:
		return Intrinsic::not_intrinsic;

		#define HANDLE_VP_TO_OC(VPID, OC) \
		case Instruction::OC: \
		return Intrinsic::VPID;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		bool VPIntrinsic::canIgnoreVectorLengthParam() const {
		// No vlen param - no lanes masked-off by it.
		auto *VLParam = getVectorLengthParam();
		if (!VLParam)
		return true;

		// Can ignore if MSB of vlen is set.
		auto VLConst = dyn_cast<ConstantInt>(VLParam);
		if (VLConst && VLConst->getSExtValue() < 0)
		return true;

		// Vlen param greater-equal type vlen - no lanes masked-off.
		if (VLConst) {
		auto ElemCount = getVectorLength();
		if (ElemCount.Scalable)
		return false;

		uint64_t VLNum = VLConst->getZExtValue();
		if (VLNum >= ElemCount.Min)
		return true;
		}

		// Cannot ignore vlen param by default.
		return false;
		}
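
The decision procedure in canIgnoreVectorLengthParam can be restated without any LLVM types. The sketch below assumes C++17 and a 32-bit signed vector-length operand; `DemoElementCount` and `canIgnoreVectorLength` are hypothetical names, and a non-constant operand is modelled as `std::nullopt`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a vector type's element count.
struct DemoElementCount {
  unsigned Min;  // minimum number of lanes
  bool Scalable; // true if the real count is a runtime multiple of Min
};

// ConstVL holds the vector-length operand when it is a compile-time
// constant, otherwise std::nullopt.
bool canIgnoreVectorLength(bool HasVLParam, std::optional<int32_t> ConstVL,
                           DemoElementCount EC) {
  if (!HasVLParam)
    return true;  // no vlen operand, so it cannot mask off any lane
  if (!ConstVL)
    return false; // value unknown at compile time
  if (*ConstVL < 0)
    return true;  // sign bit set: treated as "all lanes active"
  if (EC.Scalable)
    return false; // actual lane count unknown at compile time
  // A constant vlen that covers every lane masks off nothing.
  return static_cast<uint64_t>(*ConstVL) >= EC.Min;
}
```

For example, a constant vector length of 8 on an 8-lane fixed vector can be ignored, while the same constant on a scalable vector cannot.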

		CmpInst::Predicate VPIntrinsic::getCmpPredicate() const {
		return static_cast<CmpInst::Predicate>(
		cast<ConstantInt>(getArgOperand(2))->getZExtValue());
		}

		Optional<fp::RoundingMode> VPIntrinsic::getRoundingMode() const {
		auto RmParamPos = GetRoundingModeParamPos(getIntrinsicID());
		if (!RmParamPos)
		return None;

		Metadata *MD = cast<MetadataAsValue>(getArgOperand(RmParamPos.getValue()))
		->getMetadata();
		if (!MD || !isa<MDString>(MD))
		return None;
		StringRef RoundingArg = cast<MDString>(MD)->getString();
		return StrToRoundingMode(RoundingArg);
		}

		Optional<fp::ExceptionBehavior> VPIntrinsic::getExceptionBehavior() const {
		auto EbParamPos = GetExceptionBehaviorParamPos(getIntrinsicID());
		if (!EbParamPos)
		return None;

		Metadata *MD = cast<MetadataAsValue>(getArgOperand(EbParamPos.getValue()))
		->getMetadata();
		if (!MD || !isa<MDString>(MD))
		return None;
		StringRef ExceptionArg = cast<MDString>(MD)->getString();
		return StrToExceptionBehavior(ExceptionArg);
		}

		/// \return The vector to reduce if this is a reduction operation.
		Value *VPIntrinsic::getReductionVectorParam() const {
		auto PosOpt = GetReductionVectorParamPos(getIntrinsicID());
		if (!PosOpt.hasValue())
		return nullptr;
		return getArgOperand(PosOpt.getValue());
		}

		Optional<int> VPIntrinsic::GetReductionVectorParamPos(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return None;

		#define HANDLE_VP_REDUCTION(VPID, ACCUPOS, VECTORPOS) \
		case Intrinsic::VPID: \
		return VECTORPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		/// \return The accumulator initial value if this is a reduction operation.
		Value *VPIntrinsic::getReductionAccuParam() const {
		auto PosOpt = GetReductionAccuParamPos(getIntrinsicID());
		if (!PosOpt.hasValue())
		return nullptr;
		return getArgOperand(PosOpt.getValue());
		}

		Optional<int> VPIntrinsic::GetReductionAccuParamPos(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return None;

		#define HANDLE_VP_REDUCTION(VPID, ACCUPOS, VECTORPOS) \
		case Intrinsic::VPID: \
		return ACCUPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		/// \return the alignment of the pointer used by this load/store/gather or
		/// scatter.
		MaybeAlign VPIntrinsic::getPointerAlignment() const {
		Optional<int> PtrParamOpt = GetMemoryPointerParamPos(getIntrinsicID());
		assert(PtrParamOpt.hasValue() && "no pointer argument!");
		unsigned AlignVal = this->getParamAlignment(PtrParamOpt.getValue());
		if (AlignVal) {
		return MaybeAlign(AlignVal);
		}
		return None;
		}

		/// \return The pointer operand of this load, store, gather or scatter.
		Value *VPIntrinsic::getMemoryPointerParam() const {
		auto PtrParamOpt = GetMemoryPointerParamPos(getIntrinsicID());
		if (!PtrParamOpt.hasValue())
		return nullptr;
		return getArgOperand(PtrParamOpt.getValue());
		}

		Optional<int> VPIntrinsic::GetMemoryPointerParamPos(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return None;

		#define HANDLE_VP_IS_MEMOP(VPID, POINTERPOS, DATAPOS) \
		case Intrinsic::VPID: \
		return POINTERPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		/// \return The data (payload) operand of this store or scatter.
		Value *VPIntrinsic::getMemoryDataParam() const {
		auto DataParamOpt = GetMemoryDataParamPos(getIntrinsicID());
		if (!DataParamOpt.hasValue())
		return nullptr;
		return getArgOperand(DataParamOpt.getValue());
		}

		Optional<int> VPIntrinsic::GetMemoryDataParamPos(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return None;

		#define HANDLE_VP_IS_MEMOP(VPID, POINTERPOS, DATAPOS) \
		case Intrinsic::VPID: \
		return DATAPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Function *VPIntrinsic::GetDeclarationForParams(Module *M, Intrinsic::ID VPID,
		ArrayRef<Value *> Params,
		Type *VecRetTy) {
		assert(VPID != Intrinsic::not_intrinsic && "todo dispatch to default insts");

		bool IsArithOp = VPIntrinsic::IsBinaryVPOp(VPID) ||
		VPIntrinsic::IsUnaryVPOp(VPID) ||
		VPIntrinsic::IsTernaryVPOp(VPID);
		bool IsCmpOp = (VPID == Intrinsic::vp_icmp) || (VPID == Intrinsic::vp_fcmp);
		bool IsReduceOp = VPIntrinsic::IsVPReduction(VPID);
		bool IsShuffleOp =
		(VPID == Intrinsic::vp_compress) || (VPID == Intrinsic::vp_expand) ||
		(VPID == Intrinsic::vp_vshift) || (VPID == Intrinsic::vp_select) ||
		(VPID == Intrinsic::vp_compose);
		bool IsMemoryOp =
		(VPID == Intrinsic::vp_store) || (VPID == Intrinsic::vp_load) ||
		(VPID == Intrinsic::vp_scatter) || (VPID == Intrinsic::vp_gather);
		bool IsCastOp =
		(VPID == Intrinsic::vp_fptosi) || (VPID == Intrinsic::vp_fptoui) ||
		(VPID == Intrinsic::vp_sitofp) || (VPID == Intrinsic::vp_uitofp) ||
		(VPID == Intrinsic::vp_fpext) \|\| (VPID == Intrinsic::vp_fptrunc);

		Type *VecTy = nullptr;
		Type *VecPtrTy = nullptr;

		if (IsArithOp || IsCmpOp || IsCastOp) {
		Value &FirstOp = *Params[0];

		// The first operand determines the vector type
		VecTy = cast<VectorType>(FirstOp.getType());

		} else if (IsReduceOp) {
		auto VectorPosOpt = GetReductionVectorParamPos(VPID);
		Value *VectorParam = Params[VectorPosOpt.getValue()];

		VecTy = VectorParam->getType();

		} else if (IsMemoryOp) {
		auto DataPosOpt = VPIntrinsic::GetMemoryDataParamPos(VPID);
		auto PtrPosOpt = VPIntrinsic::GetMemoryPointerParamPos(VPID);
		VecPtrTy = Params[PtrPosOpt.getValue()]->getType();

		if (DataPosOpt.hasValue()) {
		// store-kind operation
		VecTy = Params[DataPosOpt.getValue()]->getType();
		} else {
		// load-kind operation
		VecTy = VecPtrTy->getPointerElementType();
		}

		} else if (IsShuffleOp) {
		VecTy = (VPID == Intrinsic::vp_select) ? Params[1]->getType()
		: Params[0]->getType();
		}

		auto TypeTokens = VPIntrinsic::GetTypeTokens(VPID);
		auto *VPFunc = Intrinsic::getDeclaration(
		M, VPID,
		VPIntrinsic::EncodeTypeTokens(TypeTokens, VecRetTy, VecPtrTy, *VecTy));
		assert(VPFunc && "not a VP intrinsic");

		return VPFunc;
		}

		VPIntrinsic::TypeTokenVec VPIntrinsic::GetTypeTokens(Intrinsic::ID ID) {
		switch (ID) {
		default:
		llvm_unreachable("not implemented!");

		case Intrinsic::vp_cos:
		case Intrinsic::vp_sin:
		case Intrinsic::vp_exp:
		case Intrinsic::vp_exp2:

		case Intrinsic::vp_log:
		case Intrinsic::vp_log2:
		case Intrinsic::vp_log10:
		case Intrinsic::vp_sqrt:
		case Intrinsic::vp_ceil:
		case Intrinsic::vp_floor:
		case Intrinsic::vp_round:
		case Intrinsic::vp_trunc:
		case Intrinsic::vp_rint:
		case Intrinsic::vp_nearbyint:

		case Intrinsic::vp_and:
		case Intrinsic::vp_or:
		case Intrinsic::vp_xor:
		case Intrinsic::vp_ashr:
		case Intrinsic::vp_lshr:
		case Intrinsic::vp_shl:
		case Intrinsic::vp_add:
		case Intrinsic::vp_sub:
		case Intrinsic::vp_mul:
		case Intrinsic::vp_sdiv:
		case Intrinsic::vp_udiv:
		case Intrinsic::vp_srem:
		case Intrinsic::vp_urem:

		case Intrinsic::vp_fadd:
		case Intrinsic::vp_fsub:
		case Intrinsic::vp_fmul:
		case Intrinsic::vp_fdiv:
		case Intrinsic::vp_frem:
		case Intrinsic::vp_pow:
		case Intrinsic::vp_powi:
		case Intrinsic::vp_maxnum:
		case Intrinsic::vp_minnum:
		case Intrinsic::vp_vshift:
		return TypeTokenVec{VPTypeToken::Vector};

		case Intrinsic::vp_select:
		return TypeTokenVec{VPTypeToken::Returned};

		case Intrinsic::vp_reduce_and:
		case Intrinsic::vp_reduce_or:
		case Intrinsic::vp_reduce_xor:

		case Intrinsic::vp_reduce_add:
		case Intrinsic::vp_reduce_mul:
		case Intrinsic::vp_reduce_fadd:
		case Intrinsic::vp_reduce_fmul:

		case Intrinsic::vp_reduce_fmin:
		case Intrinsic::vp_reduce_fmax:
		case Intrinsic::vp_reduce_smin:
		case Intrinsic::vp_reduce_smax:
		case Intrinsic::vp_reduce_umin:
		case Intrinsic::vp_reduce_umax:
		return TypeTokenVec{VPTypeToken::Vector};

		case Intrinsic::vp_gather:
		case Intrinsic::vp_load:
		return TypeTokenVec{VPTypeToken::Returned, VPTypeToken::Pointer};

		case Intrinsic::vp_scatter:
		case Intrinsic::vp_store:
		return TypeTokenVec{VPTypeToken::Pointer, VPTypeToken::Vector};

		case Intrinsic::vp_fpext:
		case Intrinsic::vp_fptrunc:
		case Intrinsic::vp_fptoui:
		case Intrinsic::vp_fptosi:
		case Intrinsic::vp_sitofp:
		case Intrinsic::vp_uitofp:
		return TypeTokenVec{VPTypeToken::Returned, VPTypeToken::Vector};

		case Intrinsic::vp_icmp:
		case Intrinsic::vp_fcmp:
		return TypeTokenVec{VPTypeToken::Vector};
		}
		}

		bool VPIntrinsic::isReductionOp() const {
		return IsVPReduction(getIntrinsicID());
		}

		bool VPIntrinsic::IsVPReduction(Intrinsic::ID ID) {
		switch (ID) {
		default:
		return false;

		#define HANDLE_VP_REDUCTION(VPID, ACCUPOS, VECTORPOS) \
		case Intrinsic::VPID: \
		break;
		#include "llvm/IR/VPIntrinsics.def"
		}

		return true;
		}

		bool VPIntrinsic::isConstrainedOp() const {
		return (getRoundingMode() != None &&
		getRoundingMode() != fp::RoundingMode::rmToNearest) ||
		(getExceptionBehavior() != None &&
		getExceptionBehavior() != fp::ExceptionBehavior::ebIgnore);
		}

		bool VPIntrinsic::isUnaryOp() const { return IsUnaryVPOp(getIntrinsicID()); }

		bool VPIntrinsic::IsUnaryVPOp(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return false;

		#define HANDLE_VP_UNARYOP(VPID) \
		case Intrinsic::VPID: \
		return true;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		bool VPIntrinsic::isBinaryOp() const { return IsBinaryVPOp(getIntrinsicID()); }

		bool VPIntrinsic::IsBinaryVPOp(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return false;

		#define HANDLE_VP_IS_BINARY(VPID) \
		case Intrinsic::VPID: \
		return true;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		bool VPIntrinsic::isTernaryOp() const {
		return IsTernaryVPOp(getIntrinsicID());
		}

		bool VPIntrinsic::IsTernaryVPOp(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return false;

		#define HANDLE_VP_IS_TERNARY(VPID) \
		case Intrinsic::VPID: \
		return true;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		bool VPIntrinsic::isCompareOp() const {
		return IsCompareVPOp(getIntrinsicID());
		}

		bool VPIntrinsic::IsCompareVPOp(Intrinsic::ID VPID) {
		switch (VPID) {
		default:
		return false;

		#define HANDLE_VP_IS_XCMP(VPID) \
		case Intrinsic::VPID: \
		return true;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Optional<int>
		VPIntrinsic::GetExceptionBehaviorParamPos(Intrinsic::ID IntrinsicID) {
		switch (IntrinsicID) {
		default:
		return None;

		#define HANDLE_VP_FPCONSTRAINT(VPID, ROUNDPOS, EXCEPTPOS) \
		case Intrinsic::VPID: \
		return EXCEPTPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Optional<int> VPIntrinsic::GetRoundingModeParamPos(Intrinsic::ID IntrinsicID) {
		switch (IntrinsicID) {
		default:
		return None;

		#define HANDLE_VP_FPCONSTRAINT(VPID, ROUNDPOS, EXCEPTPOS) \
		case Intrinsic::VPID: \
		return ROUNDPOS;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		Intrinsic::ID VPIntrinsic::GetForIntrinsic(Intrinsic::ID IntrinsicID) {
		Optional<Intrinsic::ID> ConstrainedID;
		switch (IntrinsicID) {
		default:
		return Intrinsic::not_intrinsic;

		#define HANDLE_VP_TO_CONSTRAINED_INTRIN(VPID, CFPID) return Intrinsic::VPID;
		#define HANDLE_VP_TO_INTRIN(VPID, IID) return Intrinsic::VPID;
		#include "llvm/IR/VPIntrinsics.def"
		}
		}

		VPIntrinsic::ShortTypeVec
		VPIntrinsic::EncodeTypeTokens(VPIntrinsic::TypeTokenVec TTVec, Type *VecRetTy,
		Type *VecPtrTy, Type &VectorTy) {
		ShortTypeVec STV;

		for (auto Token : TTVec) {
		switch (Token) {
		default:
		llvm_unreachable("unsupported token"); // unsupported VPTypeToken

		case VPIntrinsic::VPTypeToken::Vector:
		STV.push_back(&VectorTy);
		break;
		case VPIntrinsic::VPTypeToken::Pointer:
		STV.push_back(VecPtrTy);
		break;
		case VPIntrinsic::VPTypeToken::Returned:
		assert(VecRetTy);
		STV.push_back(VecRetTy);
		break;
		case VPIntrinsic::VPTypeToken::Mask:
		auto NumElems = VectorTy.getVectorNumElements();
		auto MaskTy =
		VectorType::get(Type::getInt1Ty(VectorTy.getContext()), NumElems);
		STV.push_back(MaskTy);
		break;
		}
		}

		return STV;
		}

Instruction::BinaryOps BinaryOpIntrinsic::getBinaryOp() const {		Instruction::BinaryOps BinaryOpIntrinsic::getBinaryOp() const {
switch (getIntrinsicID()) {		switch (getIntrinsicID()) {
case Intrinsic::uadd_with_overflow:		case Intrinsic::uadd_with_overflow:
case Intrinsic::sadd_with_overflow:		case Intrinsic::sadd_with_overflow:
case Intrinsic::uadd_sat:		case Intrinsic::uadd_sat:
case Intrinsic::sadd_sat:		case Intrinsic::sadd_sat:
return Instruction::Add;		return Instruction::Add;
case Intrinsic::usub_with_overflow:		case Intrinsic::usub_with_overflow:
case Intrinsic::ssub_with_overflow:		case Intrinsic::ssub_with_overflow:
case Intrinsic::usub_sat:		case Intrinsic::usub_sat:
case Intrinsic::ssub_sat:		case Intrinsic::ssub_sat:
return Instruction::Sub;		return Instruction::Sub;
case Intrinsic::umul_with_overflow:		case Intrinsic::umul_with_overflow:
case Intrinsic::smul_with_overflow:		case Intrinsic::smul_with_overflow:
return Instruction::Mul;		return Instruction::Mul;
default:		default:
llvm_unreachable("Invalid intrinsic");		llvm_unreachable("Invalid intrinsic");
}		}
}		}

bool BinaryOpIntrinsic::isSigned() const {		bool BinaryOpIntrinsic::isSigned() const {
switch (getIntrinsicID()) {		switch (getIntrinsicID()) {
case Intrinsic::sadd_with_overflow:		case Intrinsic::sadd_with_overflow:
case Intrinsic::ssub_with_overflow:		case Intrinsic::ssub_with_overflow:
case Intrinsic::smul_with_overflow:		case Intrinsic::smul_with_overflow:
case Intrinsic::sadd_sat:		case Intrinsic::sadd_sat:
case Intrinsic::ssub_sat:		case Intrinsic::ssub_sat:
return true;		return true;
default:		default:
return false;		return false;
}		}
}		}

unsigned BinaryOpIntrinsic::getNoWrapKind() const {		unsigned BinaryOpIntrinsic::getNoWrapKind() const {
if (isSigned())		if (isSigned())
return OverflowingBinaryOperator::NoSignedWrap;		return OverflowingBinaryOperator::NoSignedWrap;
else		else
return OverflowingBinaryOperator::NoUnsignedWrap;		return OverflowingBinaryOperator::NoUnsignedWrap;
}		}
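
The VPIntrinsic classification helpers in this file (IsVPReduction, IsUnaryVPOp, IsBinaryVPOp, IsTernaryVPOp, IsCompareVPOp) all follow the same pattern: define a HANDLE_* macro, include llvm/IR/VPIntrinsics.def, and let every entry expand into a case label. The following is a minimal, self-contained sketch of that X-macro technique, not the patch's code. The enum DemoID, the list macro DEMO_VP_INTRINSICS, and the helper names are hypothetical stand-ins; the real entries live in VPIntrinsics.def, which is not shown in this part of the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for Intrinsic::ID (assumption, not LLVM's enum).
enum class DemoID { not_intrinsic, vp_add, vp_fsub, vp_sqrt, vp_reduce_add };

// Hypothetical stand-in for the .def file: one line per registered opcode.
// The includer decides what UNARY and BINARY expand to.
#define DEMO_VP_INTRINSICS(UNARY, BINARY) \
  UNARY(vp_sqrt)                          \
  BINARY(vp_add)                          \
  BINARY(vp_fsub)

// Expands to nothing, so entries of the other kind are skipped.
#define DEMO_IGNORE(VPID)
// Expands to a case label that classifies the opcode.
#define DEMO_RETURN_TRUE(VPID) \
  case DemoID::VPID:           \
    return true;

inline bool isUnaryDemoOp(DemoID ID) {
  switch (ID) {
    DEMO_VP_INTRINSICS(DEMO_RETURN_TRUE, DEMO_IGNORE)
  default:
    break;
  }
  return false; // not listed as unary
}

inline bool isBinaryDemoOp(DemoID ID) {
  switch (ID) {
    DEMO_VP_INTRINSICS(DEMO_IGNORE, DEMO_RETURN_TRUE)
  default:
    break;
  }
  return false; // not listed as binary
}
```

With this layout, adding an opcode means touching only the list; every classifier picks it up on the next build. An ID that is absent from the list (here vp_reduce_add) falls through to the default and is reported as neither unary nor binary.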

llvm/lib/IR/PredicatedInst.cpp

This file was added.

				#include <llvm/IR/InstrTypes.h>
				#include <llvm/IR/Instruction.h>
				#include <llvm/IR/Instructions.h>
				#include <llvm/IR/IntrinsicInst.h>
				#include <llvm/IR/PredicatedInst.h>

				namespace {
				using namespace llvm;
				using ShortValueVec = SmallVector<Value *, 4>;
				} // namespace

				namespace llvm {

				bool PredicatedInstruction::canIgnoreVectorLengthParam() const {
				auto VPI = dyn_cast<VPIntrinsic>(this);
				if (!VPI)
				return true;

				return VPI->canIgnoreVectorLengthParam();
				}

				FastMathFlags PredicatedInstruction::getFastMathFlags() const {
				return cast<Instruction>(this)->getFastMathFlags();
				}

				void PredicatedOperator::copyIRFlags(const Value *V, bool IncludeWrapFlags) {
				auto *I = dyn_cast<Instruction>(this);
				if (I)
				I->copyIRFlags(V, IncludeWrapFlags);
				}

				bool
				PredicatedInstruction::isVectorReduction() const {
				auto VPI = dyn_cast<VPIntrinsic>(this);
				if (VPI) {
				return VPI->isReductionOp();
				}
				auto II = dyn_cast<IntrinsicInst>(this);
				if (!II) return false;

				switch (II->getIntrinsicID()) {
				default:
				return false;

				case Intrinsic::experimental_vector_reduce_add:
				case Intrinsic::experimental_vector_reduce_mul:
				case Intrinsic::experimental_vector_reduce_and:
				case Intrinsic::experimental_vector_reduce_or:
				case Intrinsic::experimental_vector_reduce_xor:
				case Intrinsic::experimental_vector_reduce_smin:
				case Intrinsic::experimental_vector_reduce_smax:
				case Intrinsic::experimental_vector_reduce_umin:
				case Intrinsic::experimental_vector_reduce_umax:
				case Intrinsic::experimental_vector_reduce_v2_fadd:
				case Intrinsic::experimental_vector_reduce_v2_fmul:
				case Intrinsic::experimental_vector_reduce_fmin:
				case Intrinsic::experimental_vector_reduce_fmax:
				return true;
				}
				}

				Instruction *PredicatedBinaryOperator::Create(
				Module *Mod, Value *Mask, Value *VectorLen, Instruction::BinaryOps Opc,
				Value *V1, Value *V2, const Twine &Name, BasicBlock *InsertAtEnd,
				Instruction *InsertBefore) {
				assert(!(InsertAtEnd && InsertBefore));
				auto VPID = VPIntrinsic::GetForOpcode(Opc);

				// Default Code Path
				if ((!Mod || (!Mask && !VectorLen)) || VPID == Intrinsic::not_intrinsic) {
				if (InsertAtEnd) {
				return BinaryOperator::Create(Opc, V1, V2, Name, InsertAtEnd);
				} else {
				return BinaryOperator::Create(Opc, V1, V2, Name, InsertBefore);
				}
				}

				assert(Mod && "Need a module to emit VP Intrinsics");

				// Fetch the VP intrinsic
				auto &VecTy = cast<VectorType>(*V1->getType());
				auto TypeTokens = VPIntrinsic::GetTypeTokens(VPID);
				auto *VPFunc = Intrinsic::getDeclaration(
				Mod, VPID,
				VPIntrinsic::EncodeTypeTokens(TypeTokens, &VecTy, nullptr, VecTy));

				// Encode default environment fp behavior
				LLVMContext &Ctx = V1->getContext();
				SmallVector<Value *, 6> BinOpArgs({V1, V2});
				if (VPIntrinsic::HasRoundingModeParam(VPID)) {
				BinOpArgs.push_back(
				GetConstrainedFPRounding(Ctx, fp::RoundingMode::rmToNearest));
				}
				if (VPIntrinsic::HasExceptionBehaviorParam(VPID)) {
				BinOpArgs.push_back(
				GetConstrainedFPExcept(Ctx, fp::ExceptionBehavior::ebIgnore));
				}

				BinOpArgs.push_back(Mask);
				BinOpArgs.push_back(VectorLen);

				CallInst *CI;
				if (InsertAtEnd) {
				CI = CallInst::Create(VPFunc, BinOpArgs, Name, InsertAtEnd);
				} else {
				CI = CallInst::Create(VPFunc, BinOpArgs, Name, InsertBefore);
				}

				// the VP inst does not touch memory if the exception behavior is
				// "fpexcept.ignore"
				CI->setDoesNotAccessMemory();
				return CI;
				}

				} // namespace llvm
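
PredicatedBinaryOperator::Create above assembles the call arguments in a fixed order: the two data operands, then the default rounding mode and exception behavior if the intrinsic has those parameters, then the mask, then the vector length. A minimal sketch of that ordering follows, using plain strings instead of llvm::Value pointers. DemoVPInfo and buildDemoVPArgs are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one VP opcode (assumption): does its signature
// carry the two constrained-fp parameters?
struct DemoVPInfo {
  bool HasRoundingMode;
  bool HasExceptionBehavior;
};

// Mirrors the argument order used by Create() in the diff above.
inline std::vector<std::string>
buildDemoVPArgs(const DemoVPInfo &Info, const std::string &V1,
                const std::string &V2, const std::string &Mask,
                const std::string &VectorLen) {
  std::vector<std::string> Args{V1, V2};
  // Default fp environment, encoded only when the intrinsic expects it.
  if (Info.HasRoundingMode)
    Args.push_back("round.tonearest");
  if (Info.HasExceptionBehavior)
    Args.push_back("fpexcept.ignore");
  // Mask and vector length always come last.
  Args.push_back(Mask);
  Args.push_back(VectorLen);
  return Args;
}
```

An integer opcode would yield four arguments, a floating-point opcode with both fp-environment parameters six.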

llvm/lib/IR/VPBuilder.cpp

This file was added.

				#include <llvm/ADT/SmallVector.h>
				#include <llvm/IR/FPEnv.h>
				#include <llvm/IR/Instructions.h>
				#include <llvm/IR/Intrinsics.h>
				#include <llvm/IR/PredicatedInst.h>
				#include <llvm/IR/VPBuilder.h>

				namespace {
				using namespace llvm;
				using ShortTypeVec = VPIntrinsic::ShortTypeVec;
				using ShortValueVec = SmallVector<Value *, 4>;
				} // namespace

				namespace llvm {

				Module &VPBuilder::getModule() const {
				return *Builder.GetInsertBlock()->getParent()->getParent();
				}

				Value &VPBuilder::RequestPred() {
				if (Mask)
				return *Mask;

				auto *boolTy = Builder.getInt1Ty();
				auto *maskTy = VectorType::get(boolTy, StaticVectorLength);
				return *ConstantInt::getAllOnesValue(maskTy);
				}

				Value &VPBuilder::RequestEVL() {
				if (ExplicitVectorLength)
				return *ExplicitVectorLength;

				auto *intTy = Builder.getInt32Ty();
				return *ConstantInt::get(intTy, StaticVectorLength);
				}

				Value *VPBuilder::CreateVectorCopy(Instruction &Inst, ValArray VecOpArray) {
				auto OC = Inst.getOpcode();
				auto VPID = VPIntrinsic::GetForOpcode(OC);
				if (VPID == Intrinsic::not_intrinsic) {
				return nullptr;
				}

				Optional<int> MaskPosOpt = VPIntrinsic::GetMaskParamPos(VPID);
				Optional<int> VLenPosOpt = VPIntrinsic::GetVectorLengthParamPos(VPID);
				Optional<int> FPRoundPosOpt = VPIntrinsic::GetRoundingModeParamPos(VPID);
				Optional<int> FPExceptPosOpt =
				VPIntrinsic::GetExceptionBehaviorParamPos(VPID);

				Optional<int> CmpPredPos = None;
				if (isa<CmpInst>(Inst)) {
				CmpPredPos = 2;
				}

				// TODO transfer alignment

				// construct VP vector operands (including pred and evl)
				SmallVector<Value *, 6> VecParams;
				for (size_t i = 0; i < Inst.getNumOperands() + 5; ++i) {
				if (MaskPosOpt && (i == (size_t)MaskPosOpt.getValue())) {
				// First operand of select is mask (singular exception)
				if (VPID != Intrinsic::vp_select)
				VecParams.push_back(&RequestPred());
				}
				if (VLenPosOpt && (i == (size_t)VLenPosOpt.getValue())) {
				VecParams.push_back(&RequestEVL());
				}
				if (FPRoundPosOpt && (i == (size_t)FPRoundPosOpt.getValue())) {
				// TODO decode fp env from constrained intrinsics
				VecParams.push_back(GetConstrainedFPRounding(
				Builder.getContext(), fp::RoundingMode::rmToNearest));
				}
				if (FPExceptPosOpt && (i == (size_t)FPExceptPosOpt.getValue())) {
				// TODO decode fp env from constrained intrinsics
				VecParams.push_back(GetConstrainedFPExcept(
				Builder.getContext(), fp::ExceptionBehavior::ebIgnore));
				}
				if (CmpPredPos && (i == (size_t)CmpPredPos.getValue())) {
				auto &CmpI = cast<CmpInst>(Inst);
				VecParams.push_back(ConstantInt::get(
				Type::getInt8Ty(Builder.getContext()), CmpI.getPredicate()));
				}
				if (i < VecOpArray.size())
				VecParams.push_back(VecOpArray[i]);
				}

				Type *ScaRetTy = Inst.getType();
				Type *VecRetTy = ScaRetTy->isVoidTy() ? ScaRetTy : &getVectorType(*ScaRetTy);
				auto &M = *Builder.GetInsertBlock()->getParent()->getParent();
				auto VPDecl =
				VPIntrinsic::GetDeclarationForParams(&M, VPID, VecParams, VecRetTy);

				return Builder.CreateCall(VPDecl, VecParams, Inst.getName() + ".vp");
				}

				VectorType &VPBuilder::getVectorType(Type &ElementTy) {
				return *VectorType::get(&ElementTy, StaticVectorLength);
				}

				Value &VPBuilder::CreateContiguousStore(Value &Val, Value &ElemPointer,
				MaybeAlign AlignOpt) {
				auto &PointerTy = cast<PointerType>(*ElemPointer.getType());
				auto &VecTy = getVectorType(*PointerTy.getPointerElementType());
				auto *VecPtrTy = VecTy.getPointerTo(PointerTy.getAddressSpace());
				auto *VecPtr = Builder.CreatePointerCast(&ElemPointer, VecPtrTy);

				auto *StoreFunc = Intrinsic::getDeclaration(&getModule(), Intrinsic::vp_store,
				{&VecTy, VecPtrTy});
				ShortValueVec Args{&Val, VecPtr, &RequestPred(), &RequestEVL()};
				CallInst &StoreCall = *Builder.CreateCall(StoreFunc, Args);
				if (AlignOpt.hasValue()) {
				unsigned PtrPos =
				VPIntrinsic::GetMemoryPointerParamPos(Intrinsic::vp_store).getValue();
				StoreCall.addParamAttr(
				PtrPos, Attribute::getWithAlignment(getContext(), AlignOpt.getValue()));
				}
				return StoreCall;
				}

				Value &VPBuilder::CreateContiguousLoad(Value &ElemPointer,
				MaybeAlign AlignOpt) {
				auto &PointerTy = cast<PointerType>(*ElemPointer.getType());
				auto &VecTy = getVectorType(*PointerTy.getPointerElementType());
				auto *VecPtrTy = VecTy.getPointerTo(PointerTy.getAddressSpace());
				auto *VecPtr = Builder.CreatePointerCast(&ElemPointer, VecPtrTy);

				auto *LoadFunc = Intrinsic::getDeclaration(&getModule(), Intrinsic::vp_load,
				{&VecTy, VecPtrTy});
				ShortValueVec Args{VecPtr, &RequestPred(), &RequestEVL()};
				CallInst &LoadCall = *Builder.CreateCall(LoadFunc, Args);
				if (AlignOpt.hasValue()) {
				unsigned PtrPos =
				VPIntrinsic::GetMemoryPointerParamPos(Intrinsic::vp_load).getValue();
				LoadCall.addParamAttr(
				PtrPos, Attribute::getWithAlignment(getContext(), AlignOpt.getValue()));
				}
				return LoadCall;
				}

				Value &VPBuilder::CreateScatter(Value &Val, Value &PointerVec,
				MaybeAlign AlignOpt) {
				auto *ScatterFunc =
				Intrinsic::getDeclaration(&getModule(), Intrinsic::vp_scatter,
				{Val.getType(), PointerVec.getType()});
				ShortValueVec Args{&Val, &PointerVec, &RequestPred(), &RequestEVL()};
				CallInst &ScatterCall = *Builder.CreateCall(ScatterFunc, Args);
				if (AlignOpt.hasValue()) {
				unsigned PtrPos =
				VPIntrinsic::GetMemoryPointerParamPos(Intrinsic::vp_scatter).getValue();
				ScatterCall.addParamAttr(
				PtrPos, Attribute::getWithAlignment(getContext(), AlignOpt.getValue()));
				}
				return ScatterCall;
				}

				Value &VPBuilder::CreateGather(Value &PointerVec, MaybeAlign AlignOpt) {
				auto &PointerVecTy = cast<VectorType>(*PointerVec.getType());
				auto &ElemTy = cast<PointerType>(PointerVecTy.getVectorElementType())
				.getPointerElementType();
				auto &VecTy = *VectorType::get(&ElemTy, PointerVecTy.getNumElements());
				auto *GatherFunc = Intrinsic::getDeclaration(
				&getModule(), Intrinsic::vp_gather, {&VecTy, &PointerVecTy});

				ShortValueVec Args{&PointerVec, &RequestPred(), &RequestEVL()};
				CallInst &GatherCall = *Builder.CreateCall(GatherFunc, Args);
				if (AlignOpt.hasValue()) {
				unsigned PtrPos =
				VPIntrinsic::GetMemoryPointerParamPos(Intrinsic::vp_gather).getValue();
				GatherCall.addParamAttr(
				PtrPos, Attribute::getWithAlignment(getContext(), AlignOpt.getValue()));
				}
				return GatherCall;
				}

				Value *VPBuilder::CreateVectorShift(Value *SrcVal, Value *Amount, Twine Name) {
				auto D = VPIntrinsic::GetDeclarationForParams(
				&getModule(), Intrinsic::vp_vshift, {SrcVal, Amount});
				return Builder.CreateCall(D, {SrcVal, Amount, &RequestPred(), &RequestEVL()},
				Name);
				}

				} // namespace llvm
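
RequestPred and RequestEVL above fall back to "no predication" defaults when the builder has no explicit mask or vector length: an all-ones mask and a length equal to StaticVectorLength. A minimal sketch of that defaulting logic follows, with standard containers instead of LLVM constants. DemoVPBuilder and its members are hypothetical; only the fallback behavior is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VPBuilder's predication state (assumption).
struct DemoVPBuilder {
  unsigned StaticVectorLength = 0;
  const std::vector<bool> *Mask = nullptr;             // optional explicit mask
  const std::uint32_t *ExplicitVectorLength = nullptr; // optional explicit length

  // Explicit mask if set, otherwise an all-true mask of the static length.
  std::vector<bool> requestPred() const {
    if (Mask)
      return *Mask;
    return std::vector<bool>(StaticVectorLength, true);
  }

  // Explicit vector length if set, otherwise the static length.
  std::uint32_t requestEVL() const {
    if (ExplicitVectorLength)
      return *ExplicitVectorLength;
    return StaticVectorLength;
  }
};
```

With both defaults in effect every lane is active, so the emitted VP call behaves like the unpredicated operation on the full vector.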

llvm/lib/IR/Verifier.cpp

Show First 20 Lines • Show All 68 Lines • ▼ Show 20 Lines
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/ConstantRange.h"		#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DebugInfo.h"		#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/DebugInfoMetadata.h"		#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DebugLoc.h"		#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalAlias.h"		#include "llvm/IR/GlobalAlias.h"
#include "llvm/IR/GlobalValue.h"		#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/InlineAsm.h"		#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/InstVisitor.h"		#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsWebAssembly.h"		#include "llvm/IR/IntrinsicsWebAssembly.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/IR/ModuleSlotTracker.h"		#include "llvm/IR/ModuleSlotTracker.h"
#include "llvm/IR/PassManager.h"		#include "llvm/IR/PassManager.h"
#include "llvm/IR/Statepoint.h"		#include "llvm/IR/Statepoint.h"
		#include "llvm/IR/FPEnv.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/Use.h"		#include "llvm/IR/Use.h"
#include "llvm/IR/User.h"		#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/AtomicOrdering.h"		#include "llvm/Support/AtomicOrdering.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
▲ Show 20 Lines • Show All 368 Lines • ▼ Show 20 Lines	#include "llvm/IR/Metadata.def"
void visitSwitchInst(SwitchInst &SI);		void visitSwitchInst(SwitchInst &SI);
void visitIndirectBrInst(IndirectBrInst &BI);		void visitIndirectBrInst(IndirectBrInst &BI);
void visitCallBrInst(CallBrInst &CBI);		void visitCallBrInst(CallBrInst &CBI);
void visitSelectInst(SelectInst &SI);		void visitSelectInst(SelectInst &SI);
void visitUserOp1(Instruction &I);		void visitUserOp1(Instruction &I);
void visitUserOp2(Instruction &I) { visitUserOp1(I); }		void visitUserOp2(Instruction &I) { visitUserOp1(I); }
void visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call);		void visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call);
void visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI);		void visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI);
		void visitVPIntrinsic(VPIntrinsic &FPI);
void visitDbgIntrinsic(StringRef Kind, DbgVariableIntrinsic &DII);		void visitDbgIntrinsic(StringRef Kind, DbgVariableIntrinsic &DII);
void visitDbgLabelIntrinsic(StringRef Kind, DbgLabelInst &DLI);		void visitDbgLabelIntrinsic(StringRef Kind, DbgLabelInst &DLI);
void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);		void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);
void visitAtomicRMWInst(AtomicRMWInst &RMWI);		void visitAtomicRMWInst(AtomicRMWInst &RMWI);
void visitFenceInst(FenceInst &FI);		void visitFenceInst(FenceInst &FI);
void visitAllocaInst(AllocaInst &AI);		void visitAllocaInst(AllocaInst &AI);
void visitExtractValueInst(ExtractValueInst &EVI);		void visitExtractValueInst(ExtractValueInst &EVI);
void visitInsertValueInst(InsertValueInst &IVI);		void visitInsertValueInst(InsertValueInst &IVI);
▲ Show 20 Lines • Show All 1,209 Lines • ▼ Show 20 Lines

// Check parameter attributes against a function type.		// Check parameter attributes against a function type.
// The value V is printed in error messages.		// The value V is printed in error messages.
void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs,		void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs,
const Value *V, bool IsIntrinsic) {		const Value *V, bool IsIntrinsic) {
if (Attrs.isEmpty())		if (Attrs.isEmpty())
return;		return;

		bool SawMask = false;
bool SawNest = false;		bool SawNest = false;
		bool SawPassthru = false;
bool SawReturned = false;		bool SawReturned = false;
bool SawSRet = false;		bool SawSRet = false;
bool SawSwiftSelf = false;		bool SawSwiftSelf = false;
bool SawSwiftError = false;		bool SawSwiftError = false;
		bool SawVectorLength = false;

// Verify return value attributes.		// Verify return value attributes.
AttributeSet RetAttrs = Attrs.getRetAttributes();		AttributeSet RetAttrs = Attrs.getRetAttributes();
Assert((!RetAttrs.hasAttribute(Attribute::ByVal) &&		Assert((!RetAttrs.hasAttribute(Attribute::ByVal) &&
!RetAttrs.hasAttribute(Attribute::Nest) &&		!RetAttrs.hasAttribute(Attribute::Nest) &&
!RetAttrs.hasAttribute(Attribute::StructRet) &&		!RetAttrs.hasAttribute(Attribute::StructRet) &&
!RetAttrs.hasAttribute(Attribute::NoCapture) &&		!RetAttrs.hasAttribute(Attribute::NoCapture) &&
!RetAttrs.hasAttribute(Attribute::NoFree) &&		!RetAttrs.hasAttribute(Attribute::NoFree) &&
▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	for (unsigned i = 0, e = FT->getNumParams(); i != e; ++i) {
}		}

if (ArgAttrs.hasAttribute(Attribute::SwiftError)) {		if (ArgAttrs.hasAttribute(Attribute::SwiftError)) {
Assert(!SawSwiftError, "Cannot have multiple 'swifterror' parameters!",		Assert(!SawSwiftError, "Cannot have multiple 'swifterror' parameters!",
V);		V);
SawSwiftError = true;		SawSwiftError = true;
}		}

		if (ArgAttrs.hasAttribute(Attribute::VectorLength)) {
		Assert(!SawVectorLength, "Cannot have multiple 'vlen' parameters!",
		V);
		SawVectorLength = true;
		}

		if (ArgAttrs.hasAttribute(Attribute::Passthru)) {
		Assert(!SawPassthru, "Cannot have multiple 'passthru' parameters!",
		V);
		SawPassthru = true;
		}

		if (ArgAttrs.hasAttribute(Attribute::Mask)) {
		Assert(!SawMask, "Cannot have multiple 'mask' parameters!",
		V);
		SawMask = true;
		}

if (ArgAttrs.hasAttribute(Attribute::InAlloca)) {		if (ArgAttrs.hasAttribute(Attribute::InAlloca)) {
Assert(i == FT->getNumParams() - 1,		Assert(i == FT->getNumParams() - 1,
"inalloca isn't on the last parameter!", V);		"inalloca isn't on the last parameter!", V);
}		}
}		}

		Assert(!SawPassthru || SawMask,
		"Cannot have 'passthru' parameter without 'mask' parameter!", V);

if (!Attrs.hasAttributes(AttributeList::FunctionIndex))		if (!Attrs.hasAttributes(AttributeList::FunctionIndex))
return;		return;

verifyAttributeTypes(Attrs.getFnAttributes(), /*IsFunction=*/true, V);		verifyAttributeTypes(Attrs.getFnAttributes(), /*IsFunction=*/true, V);

Assert(!(Attrs.hasFnAttribute(Attribute::ReadNone) &&		Assert(!(Attrs.hasFnAttribute(Attribute::ReadNone) &&
Attrs.hasFnAttribute(Attribute::ReadOnly)),		Attrs.hasFnAttribute(Attribute::ReadOnly)),
"Attributes 'readnone and readonly' are incompatible!", V);		"Attributes 'readnone and readonly' are incompatible!", V);
▲ Show 20 Lines • Show All 2,546 Lines • ▼ Show 20 Lines	Assert(isa<ConstantStruct>(Init) || isa<ConstantArray>(Init),
"an array");		"an array");
break;		break;
}		}
#define INSTRUCTION(NAME, NARGS, ROUND_MODE, INTRINSIC) \		#define INSTRUCTION(NAME, NARGS, ROUND_MODE, INTRINSIC) \
case Intrinsic::INTRINSIC:		case Intrinsic::INTRINSIC:
#include "llvm/IR/ConstrainedOps.def"		#include "llvm/IR/ConstrainedOps.def"
visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(Call));		visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(Call));
break;		break;

		#define REGISTER_VP_INTRINSIC(VPID,MASKPOS,VLENPOS) \
		case Intrinsic::VPID:
		#include "llvm/IR/VPIntrinsics.def"
		visitVPIntrinsic(cast<VPIntrinsic>(Call));
		break;

case Intrinsic::dbg_declare: // llvm.dbg.declare		case Intrinsic::dbg_declare: // llvm.dbg.declare
Assert(isa<MetadataAsValue>(Call.getArgOperand(0)),		Assert(isa<MetadataAsValue>(Call.getArgOperand(0)),
"invalid llvm.dbg.declare intrinsic call 1", Call);		"invalid llvm.dbg.declare intrinsic call 1", Call);
visitDbgIntrinsic("declare", cast<DbgVariableIntrinsic>(Call));		visitDbgIntrinsic("declare", cast<DbgVariableIntrinsic>(Call));
break;		break;
case Intrinsic::dbg_addr: // llvm.dbg.addr		case Intrinsic::dbg_addr: // llvm.dbg.addr
visitDbgIntrinsic("addr", cast<DbgVariableIntrinsic>(Call));		visitDbgIntrinsic("addr", cast<DbgVariableIntrinsic>(Call));
break;		break;
▲ Show 20 Lines • Show All 429 Lines • ▼ Show 20 Lines	static DISubprogram *getSubprogram(Metadata *LocalScope) {
if (auto *LB = dyn_cast<DILexicalBlockBase>(LocalScope))		if (auto *LB = dyn_cast<DILexicalBlockBase>(LocalScope))
return getSubprogram(LB->getRawScope());		return getSubprogram(LB->getRawScope());

// Just return null; broken scope chains are checked elsewhere.		// Just return null; broken scope chains are checked elsewhere.
assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");		assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");
return nullptr;		return nullptr;
}		}

		void Verifier::visitVPIntrinsic(VPIntrinsic &VPI) {
		Assert(!VPI.isConstrainedOp(),
		"VP intrinsics only support the default fp environment for now "
		"(round.tonearest; fpexcept.ignore).");
		if (VPI.isConstrainedOp()) {
		Assert(VPI.getExceptionBehavior() != None,
		"invalid exception behavior argument", &VPI);
		Assert(VPI.getRoundingMode() != None, "invalid rounding mode argument",
		&VPI);
		}
		}

void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {		void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
unsigned NumOperands;		unsigned NumOperands;
bool HasRoundingMD;		bool HasRoundingMD;
switch (FPI.getIntrinsicID()) {		switch (FPI.getIntrinsicID()) {
#define INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC) \		#define INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC) \
case Intrinsic::INTRINSIC: \		case Intrinsic::INTRINSIC: \
NumOperands = NARG; \		NumOperands = NARG; \
HasRoundingMD = ROUND_MODE; \		HasRoundingMD = ROUND_MODE; \
▲ Show 20 Lines • Show All 784 Lines • Show Last 20 Lines

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp

Show All 15 Lines
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
		#include "llvm/IR/PredicatedInst.h"
		#include "llvm/IR/VPBuilder.h"
		#include "llvm/IR/MatcherCast.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/Support/AlignOf.h"		#include "llvm/Support/AlignOf.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/KnownBits.h"		#include "llvm/Support/KnownBits.h"
#include <cassert>		#include <cassert>
#include <utility>		#include <utility>

using namespace llvm;		using namespace llvm;
▲ Show 20 Lines • Show All 2,082 Lines • ▼ Show 20 Lines	if (I.hasNoSignedZeros() &&
return BinaryOperator::CreateFSubFMF(Y, X, &I);		return BinaryOperator::CreateFSubFMF(Y, X, &I);

if (Instruction *R = hoistFNegAboveFMulFDiv(I, Builder))		if (Instruction *R = hoistFNegAboveFMulFDiv(I, Builder))
return R;		return R;

return nullptr;		return nullptr;
}		}

		Instruction *InstCombiner::visitPredicatedFSub(PredicatedBinaryOperator& I) {
		auto * Inst = cast<Instruction>(&I);
		PredicatedContext PC(&I);
		if (Value *V = SimplifyPredicatedFSubInst(I.getOperand(0), I.getOperand(1),
		I.getFastMathFlags(),
		SQ.getWithInstruction(Inst), PC))
		return replaceInstUsesWith(*Inst, V);

		return visitFSubGeneric<Instruction, PredicatedContext>(*Inst);
		}

Instruction *InstCombiner::visitFSub(BinaryOperator &I) {		Instruction *InstCombiner::visitFSub(BinaryOperator &I) {
if (Value *V = SimplifyFSubInst(I.getOperand(0), I.getOperand(1),		if (Value *V = SimplifyFSubInst(I.getOperand(0), I.getOperand(1),
I.getFastMathFlags(),		I.getFastMathFlags(),
SQ.getWithInstruction(&I)))		SQ.getWithInstruction(&I)))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (Instruction *X = foldVectorBinop(I))		if (Instruction *X = foldVectorBinop(I))
return X;		return X;

		return visitFSubGeneric<BinaryOperator, EmptyContext>(I);
		}

		template<typename BinaryOpTy, typename MatchContextType>
		Instruction *InstCombiner::visitFSubGeneric(BinaryOpTy &I) {
		MatchContextType MC(cast<Value>(&I));
		MatchContextBuilder<MatchContextType> MCBuilder(MC);

// Subtraction from -0.0 is the canonical form of fneg.		// Subtraction from -0.0 is the canonical form of fneg.
// fsub nsz 0, X ==> fsub nsz -0.0, X		// fsub nsz 0, X ==> fsub nsz -0.0, X
Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);		Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
if (I.hasNoSignedZeros() && match(Op0, m_PosZeroFP()))		if (I.hasNoSignedZeros() && MC.try_match(Op0, m_PosZeroFP()))
return BinaryOperator::CreateFNegFMF(Op1, &I);		return MCBuilder.CreateFNegFMF(Op1, &I);

if (Instruction *X = foldFNegIntoConstant(I))		if (Instruction *X = foldFNegIntoConstant(I))
return X;		return X;

if (Instruction *R = hoistFNegAboveFMulFDiv(I, Builder))		if (Instruction *R = hoistFNegAboveFMulFDiv(I, Builder))
return R;		return R;

Value *X, *Y;		Value *X, *Y;
Constant *C;		Constant *C;

		// Fold negation into constant operand. This is limited with one-use because
		// fneg is assumed better for analysis and cheaper in codegen than fmul/fdiv.
		// -(X * C) --> X * (-C)
		if (MC.try_match(&I, m_FNeg(m_OneUse(m_FMul(m_Value(X), m_Constant(C))))))
		return MCBuilder.CreateFMulFMF(X, ConstantExpr::getFNeg(C), &I);
		// -(X / C) --> X / (-C)
		if (MC.try_match(&I, m_FNeg(m_OneUse(m_FDiv(m_Value(X), m_Constant(C))))))
		return MCBuilder.CreateFDivFMF(X, ConstantExpr::getFNeg(C), &I);
		// -(C / X) --> (-C) / X
		if (MC.try_match(&I, m_FNeg(m_OneUse(m_FDiv(m_Constant(C), m_Value(X))))))
		return MCBuilder.CreateFDivFMF(ConstantExpr::getFNeg(C), X, &I);

// If Op0 is not -0.0 or we can ignore -0.0: Z - (X - Y) --> Z + (Y - X)		// If Op0 is not -0.0 or we can ignore -0.0: Z - (X - Y) --> Z + (Y - X)
// Canonicalize to fadd to make analysis easier.		// Canonicalize to fadd to make analysis easier.
// This can also help codegen because fadd is commutative.		// This can also help codegen because fadd is commutative.
// Note that if this fsub was really an fneg, the fadd with -0.0 will get		// Note that if this fsub was really an fneg, the fadd with -0.0 will get
// killed later. We still limit that particular transform with 'hasOneUse'		// killed later. We still limit that particular transform with 'hasOneUse'
// because an fneg is assumed better/cheaper than a generic fsub.		// because an fneg is assumed better/cheaper than a generic fsub.
if (I.hasNoSignedZeros() || CannotBeNegativeZero(Op0, SQ.TLI)) {		if (I.hasNoSignedZeros() || CannotBeNegativeZero(Op0, SQ.TLI)) {
if (match(Op1, m_OneUse(m_FSub(m_Value(X), m_Value(Y))))) {		if (MC.try_match(Op1, m_OneUse(m_FSub(m_Value(X), m_Value(Y))))) {
Value *NewSub = Builder.CreateFSubFMF(Y, X, &I);		Value *NewSub = MCBuilder.CreateFSubFMF(Builder, Y, X, &I);
return BinaryOperator::CreateFAddFMF(Op0, NewSub, &I);		return MCBuilder.CreateFAddFMF(Op0, NewSub, &I);
}		}
}		}

// (-X) - Op1 --> -(X + Op1)		// (-X) - Op1 --> -(X + Op1)
if (I.hasNoSignedZeros() && !isa<ConstantExpr>(Op0) &&		if (I.hasNoSignedZeros() && !isa<ConstantExpr>(Op0) &&
match(Op0, m_OneUse(m_FNeg(m_Value(X))))) {		MC.try_match(Op0, m_OneUse(m_FNeg(m_Value(X))))) {
Value *FAdd = Builder.CreateFAddFMF(X, Op1, &I);		Value *FAdd = MCBuilder.CreateFAddFMF(Builder, X, Op1, &I);
return UnaryOperator::CreateFNegFMF(FAdd, &I);		return MCBuilder.CreateFNegFMF(FAdd, &I);
}		}

if (isa<Constant>(Op0))		if (isa<Constant>(Op0))
if (SelectInst *SI = dyn_cast<SelectInst>(Op1))		if (SelectInst *SI = dyn_cast<SelectInst>(Op1))
if (Instruction *NV = FoldOpIntoSelect(I, SI))		if (Instruction *NV = FoldOpIntoSelect(I, SI))
return NV;		return NV;

// X - C --> X + (-C)		// X - C --> X + (-C)
// But don't transform constant expressions because there's an inverse fold		// But don't transform constant expressions because there's an inverse fold
// for X + (-Y) --> X - Y.		// for X + (-Y) --> X - Y.
if (match(Op1, m_Constant(C)) && !isa<ConstantExpr>(Op1))		if (MC.try_match(Op1, m_Constant(C)) && !isa<ConstantExpr>(Op1))
return BinaryOperator::CreateFAddFMF(Op0, ConstantExpr::getFNeg(C), &I);		return MCBuilder.CreateFAddFMF(Op0, ConstantExpr::getFNeg(C), &I);

// X - (-Y) --> X + Y		// X - (-Y) --> X + Y
if (match(Op1, m_FNeg(m_Value(Y))))		if (MC.try_match(Op1, m_FNeg(m_Value(Y))))
return BinaryOperator::CreateFAddFMF(Op0, Y, &I);		return MCBuilder.CreateFAddFMF(Op0, Y, &I);

// Similar to above, but look through a cast of the negated value:		// Similar to above, but look through a cast of the negated value:
// X - (fptrunc(-Y)) --> X + fptrunc(Y)		// X - (fptrunc(-Y)) --> X + fptrunc(Y)
Type *Ty = I.getType();		Type *Ty = I.getType();
if (match(Op1, m_OneUse(m_FPTrunc(m_FNeg(m_Value(Y))))))		if (MC.try_match(Op1, m_OneUse(m_FPTrunc(m_FNeg(m_Value(Y))))))
return BinaryOperator::CreateFAddFMF(Op0, Builder.CreateFPTrunc(Y, Ty), &I);		return MCBuilder.CreateFAddFMF(Op0, MCBuilder.CreateFPTrunc(Builder, Y, Ty), &I);

// X - (fpext(-Y)) --> X + fpext(Y)		// X - (fpext(-Y)) --> X + fpext(Y)
if (match(Op1, m_OneUse(m_FPExt(m_FNeg(m_Value(Y))))))		if (MC.try_match(Op1, m_OneUse(m_FPExt(m_FNeg(m_Value(Y))))))
return BinaryOperator::CreateFAddFMF(Op0, Builder.CreateFPExt(Y, Ty), &I);		return MCBuilder.CreateFAddFMF(Op0, MCBuilder.CreateFPExt(Builder, Y, Ty), &I);

// Similar to above, but look through fmul/fdiv of the negated value:		// Similar to above, but look through fmul/fdiv of the negated value:
// Op0 - (-X * Y) --> Op0 + (X * Y)		// Op0 - (-X * Y) --> Op0 + (X * Y)
// Op0 - (Y * -X) --> Op0 + (X * Y)		// Op0 - (Y * -X) --> Op0 + (X * Y)
if (match(Op1, m_OneUse(m_c_FMul(m_FNeg(m_Value(X)), m_Value(Y))))) {		if (match(Op1, m_OneUse(m_c_FMul(m_FNeg(m_Value(X)), m_Value(Y))))) {
Value *FMul = Builder.CreateFMulFMF(X, Y, &I);		Value *FMul = Builder.CreateFMulFMF(X, Y, &I);
return BinaryOperator::CreateFAddFMF(Op0, FMul, &I);		return BinaryOperator::CreateFAddFMF(Op0, FMul, &I);
}		}
// Op0 - (-X / Y) --> Op0 + (X / Y)		// Op0 - (-X / Y) --> Op0 + (X / Y)
// Op0 - (X / -Y) --> Op0 + (X / Y)		// Op0 - (X / -Y) --> Op0 + (X / Y)
if (match(Op1, m_OneUse(m_FDiv(m_FNeg(m_Value(X)), m_Value(Y)))) ||		if (match(Op1, m_OneUse(m_FDiv(m_FNeg(m_Value(X)), m_Value(Y)))) ||
match(Op1, m_OneUse(m_FDiv(m_Value(X), m_FNeg(m_Value(Y)))))) {		match(Op1, m_OneUse(m_FDiv(m_Value(X), m_FNeg(m_Value(Y)))))) {
Value *FDiv = Builder.CreateFDivFMF(X, Y, &I);		Value *FDiv = Builder.CreateFDivFMF(X, Y, &I);
return BinaryOperator::CreateFAddFMF(Op0, FDiv, &I);		return BinaryOperator::CreateFAddFMF(Op0, FDiv, &I);
}		}

// Handle special cases for FSub with selects feeding the operation		// Handle special cases for FSub with selects feeding the operation
if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1))		if (auto * PlainBinOp = dyn_cast<BinaryOperator>(&I))
		if (Value *V = SimplifySelectsFeedingBinaryOp(*PlainBinOp, Op0, Op1))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (I.hasAllowReassoc() && I.hasNoSignedZeros()) {		if (I.hasAllowReassoc() && I.hasNoSignedZeros()) {
// (Y - X) - Y --> -X		// (Y - X) - Y --> -X
if (match(Op0, m_FSub(m_Specific(Op1), m_Value(X))))		if (MC.try_match(Op0, m_FSub(m_Specific(Op1), m_Value(X))))
return BinaryOperator::CreateFNegFMF(X, &I);		return MCBuilder.CreateFNegFMF(X, &I);

// Y - (X + Y) --> -X		// Y - (X + Y) --> -X
// Y - (Y + X) --> -X		// Y - (Y + X) --> -X
if (match(Op1, m_c_FAdd(m_Specific(Op0), m_Value(X))))		if (MC.try_match(Op1, m_c_FAdd(m_Specific(Op0), m_Value(X))))
return BinaryOperator::CreateFNegFMF(X, &I);		return MCBuilder.CreateFNegFMF(X, &I);

// (X * C) - X --> X * (C - 1.0)		// (X * C) - X --> X * (C - 1.0)
if (match(Op0, m_FMul(m_Specific(Op1), m_Constant(C)))) {		if (MC.try_match(Op0, m_FMul(m_Specific(Op1), m_Constant(C)))) {
Constant *CSubOne = ConstantExpr::getFSub(C, ConstantFP::get(Ty, 1.0));		Constant *CSubOne = ConstantExpr::getFSub(C, ConstantFP::get(Ty, 1.0));
return BinaryOperator::CreateFMulFMF(Op1, CSubOne, &I);		return MCBuilder.CreateFMulFMF(Op1, CSubOne, &I);
}		}
// X - (X * C) --> X * (1.0 - C)		// X - (X * C) --> X * (1.0 - C)
if (match(Op1, m_FMul(m_Specific(Op0), m_Constant(C)))) {		if (MC.try_match(Op1, m_FMul(m_Specific(Op0), m_Constant(C)))) {
Constant *OneSubC = ConstantExpr::getFSub(ConstantFP::get(Ty, 1.0), C);		Constant *OneSubC = ConstantExpr::getFSub(ConstantFP::get(Ty, 1.0), C);
return BinaryOperator::CreateFMulFMF(Op0, OneSubC, &I);		return MCBuilder.CreateFMulFMF(Op0, OneSubC, &I);
}		}

if (Instruction *F = factorizeFAddFSub(I, Builder))		if (auto * PlainBinOp = dyn_cast<BinaryOperator>(&I)) {
		if (Instruction *F = factorizeFAddFSub(*PlainBinOp, Builder))
return F;		return F;
		}

// TODO: This performs reassociative folds for FP ops. Some fraction of the		// TODO: This performs reassociative folds for FP ops. Some fraction of the
// functionality has been subsumed by simple pattern matching here and in		// functionality has been subsumed by simple pattern matching here and in
// InstSimplify. We should let a dedicated reassociation pass handle more		// InstSimplify. We should let a dedicated reassociation pass handle more
// complex pattern matching and remove this from InstCombine.		// complex pattern matching and remove this from InstCombine.
if (Value *V = FAddCombine(Builder).simplify(&I))		if (Value *V = FAddCombine(Builder).simplify(&I))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

Show All 9 Lines
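
Note on the `visitFSubGeneric` change above: the idea of running one fold under different match contexts can be illustrated with a small standalone sketch. This is not the patch's code. `Node`, `EmptyContext`, `PredicatedContext`, `accept` and `matchSubOfNeg` below are hypothetical stand-ins, and the rule that a predicated context only accepts operands carrying the same mask as the root is an assumption made for illustration. The sketch covers only the `X - (-Y) --> X + Y` pattern.

```cpp
#include <cassert>

// Toy expression node; a hypothetical stand-in, not an LLVM class.
struct Node {
  enum Kind { Leaf, FNeg, FSub };
  Kind kind;
  const Node *lhs;
  const Node *rhs;
  int mask; // identifier of the governing mask; 0 means unpredicated
  Node(Kind k, const Node *l, const Node *r, int m)
      : kind(k), lhs(l), rhs(r), mask(m) {}
};

// Context for plain instructions: every operand is acceptable.
struct EmptyContext {
  explicit EmptyContext(const Node &) {}
  bool accept(const Node &) const { return true; }
};

// Context for predicated instructions (assumption): an operand is only
// acceptable if it is governed by the same mask as the root.
struct PredicatedContext {
  int mask;
  explicit PredicatedContext(const Node &root) : mask(root.mask) {}
  bool accept(const Node &n) const { return n.mask == mask; }
};

// X - (-Y) --> X + Y
// Returns Y when the pattern matches under the given context, else nullptr.
template <typename ContextT>
const Node *matchSubOfNeg(const Node &sub) {
  ContextT ctx(sub);
  if (sub.kind != Node::FSub || sub.rhs == nullptr)
    return nullptr;
  const Node &op1 = *sub.rhs;
  if (op1.kind != Node::FNeg || !ctx.accept(op1))
    return nullptr;
  assert(op1.lhs && "FNeg node must have an operand");
  return op1.lhs;
}
```

With this shape, the fold body is written once and the context type decides whether an inner node may take part in the match. That is the role `EmptyContext` and `PredicatedContext` appear to play as template arguments in the diff.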

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

Show All 33 Lines
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
		#include "llvm/IR/PredicatedInst.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsX86.h"		#include "llvm/IR/IntrinsicsX86.h"
#include "llvm/IR/IntrinsicsARM.h"		#include "llvm/IR/IntrinsicsARM.h"
#include "llvm/IR/IntrinsicsAArch64.h"		#include "llvm/IR/IntrinsicsAArch64.h"
#include "llvm/IR/IntrinsicsHexagon.h"		#include "llvm/IR/IntrinsicsHexagon.h"
#include "llvm/IR/IntrinsicsNVPTX.h"		#include "llvm/IR/IntrinsicsNVPTX.h"
#include "llvm/IR/IntrinsicsAMDGPU.h"		#include "llvm/IR/IntrinsicsAMDGPU.h"
#include "llvm/IR/IntrinsicsPowerPC.h"		#include "llvm/IR/IntrinsicsPowerPC.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
#include "llvm/IR/Statepoint.h"		#include "llvm/IR/Statepoint.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"		#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
#include "llvm/Support/AtomicOrdering.h"		#include "llvm/Support/AtomicOrdering.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
▲ Show 20 Lines • Show All 1,729 Lines • ▼ Show 20 Lines	Instruction *InstCombiner::visitCallInst(CallInst &CI) {

// If the caller function is nounwind, mark the call as nounwind, even if the		// If the caller function is nounwind, mark the call as nounwind, even if the
// callee isn't.		// callee isn't.
if (CI.getFunction()->doesNotThrow() && !CI.doesNotThrow()) {		if (CI.getFunction()->doesNotThrow() && !CI.doesNotThrow()) {
CI.setDoesNotThrow();		CI.setDoesNotThrow();
return &CI;		return &CI;
}		}

		// Predicated instruction patterns
		auto * VPInst = dyn_cast<VPIntrinsic>(&CI);
		if (VPInst) {
		auto * PredInst = cast<PredicatedInstruction>(VPInst);
		auto Result = visitPredicatedInstruction(PredInst);
		if (Result) return Result;
		}

IntrinsicInst *II = dyn_cast<IntrinsicInst>(&CI);		IntrinsicInst *II = dyn_cast<IntrinsicInst>(&CI);
if (!II) return visitCallBase(CI);		if (!II) return visitCallBase(CI);

// Intrinsics cannot occur in an invoke or a callbr, so handle them here		// Intrinsics cannot occur in an invoke or a callbr, so handle them here
// instead of in visitCallBase.		// instead of in visitCallBase.
if (auto *MI = dyn_cast<AnyMemIntrinsic>(II)) {		if (auto *MI = dyn_cast<AnyMemIntrinsic>(II)) {
bool Changed = false;		bool Changed = false;

▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines	if (auto *MI = dyn_cast<AnyMemIntrinsic>(II)) {
} else if (auto *MSI = dyn_cast<AnyMemSetInst>(MI)) {		} else if (auto *MSI = dyn_cast<AnyMemSetInst>(MI)) {
if (Instruction *I = SimplifyAnyMemSet(MSI))		if (Instruction *I = SimplifyAnyMemSet(MSI))
return I;		return I;
}		}

if (Changed) return II;		if (Changed) return II;
}		}

// For vector result intrinsics, use the generic demanded vector support.		// For vector result intrinsics, use the generic demanded vector support to
		// simplify any operands before moving on to the per-intrinsic rules.
if (II->getType()->isVectorTy()) {		if (II->getType()->isVectorTy()) {
auto VWidth = II->getType()->getVectorNumElements();		auto VWidth = II->getType()->getVectorNumElements();
APInt UndefElts(VWidth, 0);		APInt UndefElts(VWidth, 0);
APInt AllOnesEltMask(APInt::getAllOnesValue(VWidth));		APInt AllOnesEltMask(APInt::getAllOnesValue(VWidth));
if (Value *V = SimplifyDemandedVectorElts(II, AllOnesEltMask, UndefElts)) {		if (Value *V = SimplifyDemandedVectorElts(II, AllOnesEltMask, UndefElts)) {
if (V != II)		if (V != II)
return replaceInstUsesWith(*II, V);		return replaceInstUsesWith(*II, V);
return II;		return II;
▲ Show 20 Lines • Show All 3,118 Lines • Show Last 20 Lines

llvm/lib/Transforms/InstCombine/InstCombineInternal.h

Show All 24 Lines
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstVisitor.h"		#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
		#include "llvm/IR/PredicatedInst.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
#include "llvm/IR/Use.h"		#include "llvm/IR/Use.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/Compiler.h"		#include "llvm/Support/Compiler.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/KnownBits.h"		#include "llvm/Support/KnownBits.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/InstCombine/InstCombineWorklist.h"		#include "llvm/Transforms/InstCombine/InstCombineWorklist.h"
▲ Show 20 Lines • Show All 324 Lines • ▼ Show 20 Lines	public:
// otherwise - Change was made, replace I with returned instruction		// otherwise - Change was made, replace I with returned instruction
//		//
Instruction *visitFNeg(UnaryOperator &I);		Instruction *visitFNeg(UnaryOperator &I);
Instruction *visitAdd(BinaryOperator &I);		Instruction *visitAdd(BinaryOperator &I);
Instruction *visitFAdd(BinaryOperator &I);		Instruction *visitFAdd(BinaryOperator &I);
Value *OptimizePointerDifference(		Value *OptimizePointerDifference(
Value *LHS, Value *RHS, Type *Ty, bool isNUW);		Value *LHS, Value *RHS, Type *Ty, bool isNUW);
Instruction *visitSub(BinaryOperator &I);		Instruction *visitSub(BinaryOperator &I);
		template<typename BinaryOpTy, typename MatcherType> Instruction *visitFSubGeneric(BinaryOpTy &I);
		Instruction *visitPredicatedFSub(PredicatedBinaryOperator &I);
Instruction *visitFSub(BinaryOperator &I);		Instruction *visitFSub(BinaryOperator &I);
Instruction *visitMul(BinaryOperator &I);		Instruction *visitMul(BinaryOperator &I);
Instruction *visitFMul(BinaryOperator &I);		Instruction *visitFMul(BinaryOperator &I);
Instruction *visitURem(BinaryOperator &I);		Instruction *visitURem(BinaryOperator &I);
Instruction *visitSRem(BinaryOperator &I);		Instruction *visitSRem(BinaryOperator &I);
Instruction *visitFRem(BinaryOperator &I);		Instruction *visitFRem(BinaryOperator &I);
bool simplifyDivRemOfSelectWithZeroOp(BinaryOperator &I);		bool simplifyDivRemOfSelectWithZeroOp(BinaryOperator &I);
Instruction *commonRemTransforms(BinaryOperator &I);		Instruction *commonRemTransforms(BinaryOperator &I);
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	public:
Instruction *visitExtractElementInst(ExtractElementInst &EI);		Instruction *visitExtractElementInst(ExtractElementInst &EI);
Instruction *visitShuffleVectorInst(ShuffleVectorInst &SVI);		Instruction *visitShuffleVectorInst(ShuffleVectorInst &SVI);
Instruction *visitExtractValueInst(ExtractValueInst &EV);		Instruction *visitExtractValueInst(ExtractValueInst &EV);
Instruction *visitLandingPadInst(LandingPadInst &LI);		Instruction *visitLandingPadInst(LandingPadInst &LI);
Instruction *visitVAStartInst(VAStartInst &I);		Instruction *visitVAStartInst(VAStartInst &I);
Instruction *visitVACopyInst(VACopyInst &I);		Instruction *visitVACopyInst(VACopyInst &I);
Instruction *visitFreeze(FreezeInst &I);		Instruction *visitFreeze(FreezeInst &I);

		// Entry point to VPIntrinsic
		Instruction *visitPredicatedInstruction(PredicatedInstruction *PI) {
		switch (PI->getOpcode()) {
		default:
		return nullptr;
		case Instruction::FSub:
		return visitPredicatedFSub(cast<PredicatedBinaryOperator>(*PI));
		}
		}

/// Specify what to return for unhandled instructions.		/// Specify what to return for unhandled instructions.
Instruction *visitInstruction(Instruction &I) { return nullptr; }		Instruction *visitInstruction(Instruction &I) { return nullptr; }

/// True when DB dominates all uses of DI except UI.		/// True when DB dominates all uses of DI except UI.
/// UI must be in the same block as DI.		/// UI must be in the same block as DI.
/// The routine checks that the DI parent and DB are different.		/// The routine checks that the DI parent and DB are different.
bool dominatesAllUses(const Instruction *DI, const Instruction *UI,		bool dominatesAllUses(const Instruction *DI, const Instruction *UI,
const BasicBlock *DB) const;		const BasicBlock *DB) const;
▲ Show 20 Lines • Show All 560 Lines • Show Last 20 Lines

llvm/lib/Transforms/Utils/CodeExtractor.cpp

Show First 20 Lines • Show All 857 Lines • ▼ Show 20 Lines	if (Attr.isStringAttribute()) {
case Attribute::Convergent:		case Attribute::Convergent:
case Attribute::Dereferenceable:		case Attribute::Dereferenceable:
case Attribute::DereferenceableOrNull:		case Attribute::DereferenceableOrNull:
case Attribute::InAlloca:		case Attribute::InAlloca:
case Attribute::InReg:		case Attribute::InReg:
case Attribute::InaccessibleMemOnly:		case Attribute::InaccessibleMemOnly:
case Attribute::InaccessibleMemOrArgMemOnly:		case Attribute::InaccessibleMemOrArgMemOnly:
case Attribute::JumpTable:		case Attribute::JumpTable:
		case Attribute::Mask:
case Attribute::Naked:		case Attribute::Naked:
case Attribute::Nest:		case Attribute::Nest:
case Attribute::NoAlias:		case Attribute::NoAlias:
case Attribute::NoBuiltin:		case Attribute::NoBuiltin:
case Attribute::NoCapture:		case Attribute::NoCapture:
case Attribute::NoReturn:		case Attribute::NoReturn:
case Attribute::NoSync:		case Attribute::NoSync:
case Attribute::None:		case Attribute::None:
case Attribute::NonNull:		case Attribute::NonNull:
		case Attribute::Passthru:
case Attribute::ReadNone:		case Attribute::ReadNone:
case Attribute::ReadOnly:		case Attribute::ReadOnly:
case Attribute::Returned:		case Attribute::Returned:
case Attribute::ReturnsTwice:		case Attribute::ReturnsTwice:
case Attribute::SExt:		case Attribute::SExt:
case Attribute::Speculatable:		case Attribute::Speculatable:
case Attribute::StackAlignment:		case Attribute::StackAlignment:
case Attribute::StructRet:		case Attribute::StructRet:
case Attribute::SwiftError:		case Attribute::SwiftError:
case Attribute::SwiftSelf:		case Attribute::SwiftSelf:
case Attribute::WillReturn:		case Attribute::WillReturn:
		case Attribute::VectorLength:
case Attribute::WriteOnly:		case Attribute::WriteOnly:
case Attribute::ZExt:		case Attribute::ZExt:
case Attribute::ImmArg:		case Attribute::ImmArg:
case Attribute::EndAttrKinds:		case Attribute::EndAttrKinds:
continue;		continue;
// Those attributes should be safe to propagate to the extracted function.		// Those attributes should be safe to propagate to the extracted function.
case Attribute::AlwaysInline:		case Attribute::AlwaysInline:
case Attribute::Cold:		case Attribute::Cold:
▲ Show 20 Lines • Show All 855 Lines • Show Last 20 Lines

llvm/test/Bitcode/attributes.ll

	Show First 20 Lines • Show All 368 Lines • ▼ Show 20 Lines
	}			}

	; CHECK: define void @f63() #39			; CHECK: define void @f63() #39
	define void @f63() sanitize_memtag			define void @f63() sanitize_memtag
	{			{
	ret void;			ret void;
	}			}

				; CHECK: define <8 x double> @f64(<8 x double> passthru %0, <8 x i1> mask %1, i32 vlen %2) {
				define <8 x double> @f64(<8 x double> passthru, <8 x i1> mask, i32 vlen) {
				ret <8 x double> undef
				}

	; CHECK: attributes #0 = { noreturn }			; CHECK: attributes #0 = { noreturn }
	; CHECK: attributes #1 = { nounwind }			; CHECK: attributes #1 = { nounwind }
	; CHECK: attributes #2 = { readnone }			; CHECK: attributes #2 = { readnone }
	; CHECK: attributes #3 = { readonly }			; CHECK: attributes #3 = { readonly }
	; CHECK: attributes #4 = { noinline }			; CHECK: attributes #4 = { noinline }
	; CHECK: attributes #5 = { alwaysinline }			; CHECK: attributes #5 = { alwaysinline }
	; CHECK: attributes #6 = { optsize }			; CHECK: attributes #6 = { optsize }
	; CHECK: attributes #7 = { ssp }			; CHECK: attributes #7 = { ssp }
	Show All 33 Lines

llvm/test/CodeGen/AArch64/O0-pipeline.ll

	Show All 19 Lines
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Lower Garbage Collection Instructions			; CHECK-NEXT: Lower Garbage Collection Instructions
	; CHECK-NEXT: Shadow Stack GC Lowering			; CHECK-NEXT: Shadow Stack GC Lowering
	; CHECK-NEXT: Lower constant intrinsics			; CHECK-NEXT: Lower constant intrinsics
	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
				; CHECK-NEXT: Expand vector predication intrinsics
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: AArch64 Stack Tagging			; CHECK-NEXT: AArch64 Stack Tagging
	; CHECK-NEXT: Rewrite Symbols			; CHECK-NEXT: Rewrite Symbols
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Exception handling preparation			; CHECK-NEXT: Exception handling preparation
	; CHECK-NEXT: Safe Stack instrumentation pass			; CHECK-NEXT: Safe Stack instrumentation pass

llvm/test/CodeGen/AArch64/O3-pipeline.ll

	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Branch Probability Analysis			; CHECK-NEXT: Branch Probability Analysis
	; CHECK-NEXT: Block Frequency Analysis			; CHECK-NEXT: Block Frequency Analysis
	; CHECK-NEXT: Constant Hoisting			; CHECK-NEXT: Constant Hoisting
	; CHECK-NEXT: Partially inline calls to library functions			; CHECK-NEXT: Partially inline calls to library functions
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
				; CHECK-NEXT: Expand vector predication intrinsics
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Function Alias Analysis Results			; CHECK-NEXT: Function Alias Analysis Results
	; CHECK-NEXT: Memory SSA			; CHECK-NEXT: Memory SSA
	; CHECK-NEXT: Interleaved Load Combine Pass			; CHECK-NEXT: Interleaved Load Combine Pass
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction

llvm/test/CodeGen/ARM/O3-pipeline.ll

	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Branch Probability Analysis			; CHECK-NEXT: Branch Probability Analysis
	; CHECK-NEXT: Block Frequency Analysis			; CHECK-NEXT: Block Frequency Analysis
	; CHECK-NEXT: Constant Hoisting			; CHECK-NEXT: Constant Hoisting
	; CHECK-NEXT: Partially inline calls to library functions			; CHECK-NEXT: Partially inline calls to library functions
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
				; CHECK-NEXT: Expand vector predication intrinsics
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Scalar Evolution Analysis			; CHECK-NEXT: Scalar Evolution Analysis
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Function Alias Analysis Results			; CHECK-NEXT: Function Alias Analysis Results
	; CHECK-NEXT: Transform functions to use DSP intrinsics			; CHECK-NEXT: Transform functions to use DSP intrinsics

llvm/test/CodeGen/Generic/expand-vp.ll

This file was added.

				; RUN: opt --expand-vec-pred -S < %s | FileCheck %s

				define void @test_vp_int(<8 x i32> %i0, <8 x i32> %i1, <8 x i32> %i2, <8 x i32> %f3, <8 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.add}}
				; CHECK-NOT: {{call.* @llvm.vp.sub}}
				; CHECK-NOT: {{call.* @llvm.vp.mul}}
				; CHECK-NOT: {{call.* @llvm.vp.sdiv}}
				; CHECK-NOT: {{call.* @llvm.vp.srem}}
				; CHECK-NOT: {{call.* @llvm.vp.udiv}}
				; CHECK-NOT: {{call.* @llvm.vp.urem}}
				; CHECK-NOT: {{call.* @llvm.vp.and}}
				; CHECK-NOT: {{call.* @llvm.vp.or}}
				; CHECK-NOT: {{call.* @llvm.vp.xor}}
				; CHECK-NOT: {{call.* @llvm.vp.ashr}}
				; CHECK-NOT: {{call.* @llvm.vp.lshr}}
				; CHECK-NOT: {{call.* @llvm.vp.shl}}
				%r0 = call <8 x i32> @llvm.vp.add.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r1 = call <8 x i32> @llvm.vp.sub.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r2 = call <8 x i32> @llvm.vp.mul.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r3 = call <8 x i32> @llvm.vp.sdiv.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r4 = call <8 x i32> @llvm.vp.srem.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r5 = call <8 x i32> @llvm.vp.udiv.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r6 = call <8 x i32> @llvm.vp.urem.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r7 = call <8 x i32> @llvm.vp.and.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r8 = call <8 x i32> @llvm.vp.or.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r9 = call <8 x i32> @llvm.vp.xor.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%rA = call <8 x i32> @llvm.vp.ashr.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%rB = call <8 x i32> @llvm.vp.lshr.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%rC = call <8 x i32> @llvm.vp.shl.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_constrainedfp(<8 x double> %f0, <8 x double> %f1, <8 x double> %f2, <8 x double> %f3, <8 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.reduce}}
				; CHECK-NOT: {{call.* @llvm.vp.fadd}}
				; CHECK-NOT: {{call.* @llvm.vp.fsub}}
				; CHECK-NOT: {{call.* @llvm.vp.fmul}}
				; CHECK-NOT: {{call.* @llvm.vp.frem}}
				; CHECK-NOT: {{call.* @llvm.vp.fma}}
				; CHECK-NOT: {{call.* @llvm.vp.fneg}}
				%r0 = call <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r1 = call <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tozero", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r2 = call <8 x double> @llvm.vp.fmul.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tozero", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r3 = call <8 x double> @llvm.vp.fdiv.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tozero", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r4 = call <8 x double> @llvm.vp.frem.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tozero", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r5 = call <8 x double> @llvm.vp.fma.v8f64(<8 x double> %f0, <8 x double> %f1, <8 x double> %f2, metadata !"round.tozero", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r6 = call <8 x double> @llvm.vp.fneg.v8f64(<8 x double> %f2, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r7 = call <8 x double> @llvm.vp.minnum.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r8 = call <8 x double> @llvm.vp.maxnum.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_fpcast(<8 x double> %x, <8 x i64> %y, <8 x float> %z, <8 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.fptosi}}
				; CHECK-NOT: {{call.* @llvm.vp.fptoui}}
				; CHECK-NOT: {{call.* @llvm.vp.sitofp}}
				; CHECK-NOT: {{call.* @llvm.vp.uitofp}}
				; CHECK-NOT: {{call.* @llvm.vp.rint}}
				; CHECK-NOT: {{call.* @llvm.vp.round}}
				; CHECK-NOT: {{call.* @llvm.vp.nearbyint}}
				; CHECK-NOT: {{call.* @llvm.vp.ceil}}
				; CHECK-NOT: {{call.* @llvm.vp.floor}}
				; CHECK-NOT: {{call.* @llvm.vp.trunc}}
				; CHECK-NOT: {{call.* @llvm.vp.fptrunc}}
				; CHECK-NOT: {{call.* @llvm.vp.fpext}}
				%r0 = call <8 x i64> @llvm.vp.fptosi.v8i64v8f64(<8 x double> %x, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r1 = call <8 x i64> @llvm.vp.fptoui.v8i64v8f64(<8 x double> %x, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r2 = call <8 x double> @llvm.vp.sitofp.v8f64v8i64(<8 x i64> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r3 = call <8 x double> @llvm.vp.uitofp.v8f64v8i64(<8 x i64> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r4 = call <8 x double> @llvm.vp.rint.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r7 = call <8 x double> @llvm.vp.round.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rA = call <8 x double> @llvm.vp.nearbyint.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rB = call <8 x double> @llvm.vp.ceil.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rC = call <8 x double> @llvm.vp.floor.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rD = call <8 x double> @llvm.vp.trunc.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rE = call <8 x float> @llvm.vp.fptrunc.v8f32v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rF = call <8 x double> @llvm.vp.fpext.v8f64v8f32(<8 x float> %z, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_fpfuncs(<8 x double> %x, <8 x double> %y, <8 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.pow}}
				; CHECK-NOT: {{call.* @llvm.vp.sqrt}}
				; CHECK-NOT: {{call.* @llvm.vp.sin}}
				; CHECK-NOT: {{call.* @llvm.vp.cos}}
				; CHECK-NOT: {{call.* @llvm.vp.log}}
				; CHECK-NOT: {{call.* @llvm.vp.exp\.}}
				; CHECK-NOT: {{call.* @llvm.vp.exp2}}
				%r0 = call <8 x double> @llvm.vp.pow.v8f64(<8 x double> %x, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r1 = call <8 x double> @llvm.vp.powi.v8f64(<8 x double> %x, i32 %n, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r2 = call <8 x double> @llvm.vp.sqrt.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r3 = call <8 x double> @llvm.vp.sin.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r4 = call <8 x double> @llvm.vp.cos.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r5 = call <8 x double> @llvm.vp.log.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r6 = call <8 x double> @llvm.vp.log10.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r7 = call <8 x double> @llvm.vp.log2.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r8 = call <8 x double> @llvm.vp.exp.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r9 = call <8 x double> @llvm.vp.exp2.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_mem(<16 x i32> %p0, <16 x i32> %p1, <16 x i32> %i0, <16 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.load}}
				; CHECK-NOT: {{call.* @llvm.vp.store}}
				; CHECK-NOT: {{call.* @llvm.vp.gather}}
				; CHECK-NOT: {{call.* @llvm.vp.scatter}}
				call void @llvm.vp.store.v16i32.p0v16i32(<16 x i32> %i0, <16 x i32>* %p1, <16 x i1> %m, i32 %n)
				call void @llvm.vp.scatter.v16i32.v16p0i32(<16 x i32> %i0 , <16 x i32*> %p0, <16 x i1> %m, i32 %n)
				%l0 = call <16 x i32> @llvm.vp.load.v16i32.p0v16i32(<16 x i32>* %p1, <16 x i1> %m, i32 %n)
				%l1 = call <16 x i32> @llvm.vp.gather.v16i32.v16p0i32(<16 x i32*> %p0, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_reduce_fp(<16 x float> %v, <16 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.reduce.fadd}}
				; CHECK-NOT: {{call.* @llvm.vp.reduce.fmul}}
				; CHECK-NOT: {{call.* @llvm.vp.reduce.fmin}}
				; CHECK-NOT: {{call.* @llvm.vp.reduce.fmax}}
				%r0 = call float @llvm.vp.reduce.fadd.v16f32(float 0.0, <16 x float> %v, <16 x i1> %m, i32 %n)
				%r1 = call float @llvm.vp.reduce.fmul.v16f32(float 42.0, <16 x float> %v, <16 x i1> %m, i32 %n)
				%r2 = call float @llvm.vp.reduce.fmin.v16f32(<16 x float> %v, <16 x i1> %m, i32 %n)
				%r3 = call float @llvm.vp.reduce.fmax.v16f32(<16 x float> %v, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_reduce_int(<16 x i32> %v, <16 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.reduce}}
				%r0 = call i32 @llvm.vp.reduce.add.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r1 = call i32 @llvm.vp.reduce.mul.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r2 = call i32 @llvm.vp.reduce.and.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r3 = call i32 @llvm.vp.reduce.xor.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r4 = call i32 @llvm.vp.reduce.or.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r5 = call i32 @llvm.vp.reduce.smin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r6 = call i32 @llvm.vp.reduce.smax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r7 = call i32 @llvm.vp.reduce.umin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r8 = call i32 @llvm.vp.reduce.umax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_shuffle(<16 x float> %v0, <16 x float> %v1, <16 x i1> %m, i32 %k, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.select}}
				; CHECK-NOT: {{call.* @llvm.vp.compose}}
				; no generic lowering available: {{call.* @llvm.vp.compress}}
				; no generic lowering available: {{call.* @llvm.vp.expand}}
				; CHECK-NOT: {{call.* @llvm.vp.vshift}}
				%r0 = call <16 x float> @llvm.vp.select.v16f32(<16 x i1> %m, <16 x float> %v0, <16 x float> %v1, i32 %n)
				%r1 = call <16 x float> @llvm.vp.compose.v16f32(<16 x float> %v0, <16 x float> %v1, i32 %k, i32 %n)
				%r2 = call <16 x float> @llvm.vp.vshift.v16f32(<16 x float> %v0, i32 7, <16 x i1> %m, i32 %n)
				%r3 = call <16 x float> @llvm.vp.compress.v16f32(<16 x float> %v0, <16 x i1> %m, i32 %n)
				%r4 = call <16 x float> @llvm.vp.expand.v16f32(<16 x float> %v0, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_xcmp(<16 x i32> %i0, <16 x i32> %i1, <16 x float> %f0, <16 x float> %f1, <16 x i1> %m, i32 %n) {
				; CHECK-NOT: {{call.* @llvm.vp.icmp}}
				; CHECK-NOT: {{call.* @llvm.vp.fcmp}}
				%r0 = call <16 x i1> @llvm.vp.icmp.v16i32(<16 x i32> %i0, <16 x i32> %i1, i8 38, <16 x i1> %m, i32 %n)
				%r1 = call <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> %f0, <16 x float> %f1, i8 10, <16 x i1> %m, i32 %n)
				ret void
				}

				; integer arith
				declare <8 x i32> @llvm.vp.add.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.sub.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.mul.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.sdiv.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.srem.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.udiv.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.urem.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				; bit arith
				declare <8 x i32> @llvm.vp.and.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.xor.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.or.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.ashr.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.lshr.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.shl.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)

				; floating point arith
				declare <8 x double> @llvm.vp.fadd.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fsub.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fmul.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fdiv.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.frem.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fma.v8f64(<8 x double>, <8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fneg.v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.minnum.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.maxnum.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)

				; cast & conversions
				declare <8 x i64> @llvm.vp.fptosi.v8i64v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x i64> @llvm.vp.fptoui.v8i64v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.sitofp.v8f64v8i64(<8 x i64>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.uitofp.v8f64v8i64(<8 x i64>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.rint.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.round.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.nearbyint.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.ceil.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.floor.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.trunc.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x float> @llvm.vp.fptrunc.v8f32v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fpext.v8f64v8f32(<8 x float> %x, metadata, <8 x i1> mask, i32 vlen)

				; math ops
				declare <8 x double> @llvm.vp.pow.v8f64(<8 x double> %x, <8 x double> %y, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.powi.v8f64(<8 x double> %x, i32 %y, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.sqrt.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.sin.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.cos.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.log.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.log10.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.log2.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.exp.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.exp2.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)

				; memory
				declare void @llvm.vp.store.v16i32.p0v16i32(<16 x i32>, <16 x i32>*, <16 x i1> mask, i32 vlen)
				declare <16 x i32> @llvm.vp.load.v16i32.p0v16i32(<16 x i32>*, <16 x i1> mask, i32 vlen)
				declare void @llvm.vp.scatter.v16i32.v16p0i32(<16 x i32>, <16 x i32*>, <16 x i1> mask, i32 vlen)
				declare <16 x i32> @llvm.vp.gather.v16i32.v16p0i32(<16 x i32*>, <16 x i1> mask, i32 vlen)

				; reductions
				declare float @llvm.vp.reduce.fadd.v16f32(float, <16 x float>, <16 x i1> mask, i32 vlen)
				declare float @llvm.vp.reduce.fmul.v16f32(float, <16 x float>, <16 x i1> mask, i32 vlen)
				declare float @llvm.vp.reduce.fmin.v16f32(<16 x float>, <16 x i1> mask, i32 vlen)
				declare float @llvm.vp.reduce.fmax.v16f32(<16 x float>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.add.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.mul.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.and.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.xor.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.or.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.smax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				declare i32 @llvm.vp.reduce.smin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				declare i32 @llvm.vp.reduce.umax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				declare i32 @llvm.vp.reduce.umin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)

				; shuffles
				declare <16 x float> @llvm.vp.select.v16f32(<16 x i1>, <16 x float>, <16 x float>, i32 vlen)
				declare <16 x float> @llvm.vp.compose.v16f32(<16 x float>, <16 x float>, i32, i32 vlen)
				declare <16 x float> @llvm.vp.vshift.v16f32(<16 x float>, i32, <16 x i1>, i32 vlen)
				declare <16 x float> @llvm.vp.compress.v16f32(<16 x float>, <16 x i1>, i32 vlen)
				declare <16 x float> @llvm.vp.expand.v16f32(<16 x float>, <16 x i1> mask, i32 vlen)

				; icmp , fcmp
				declare <16 x i1> @llvm.vp.icmp.v16i32(<16 x i32>, <16 x i32>, i8, <16 x i1> mask, i32 vlen)
				declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float>, <16 x float>, i8, <16 x i1> mask, i32 vlen)
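
Note on the two trailing operands used throughout expand-vp.ll: the sketch below is a small reference model in Python, not the code of the expansion pass. It assumes the usual reading of the `mask` and `vlen` operands, namely that lane i is active only when its mask bit is set and i is below the explicit vector length, and that results on inactive lanes are unspecified (modelled here as `None`). The helper names `vp_binop`, `expand_to_unpredicated` and `refines` are hypothetical and exist only for this illustration.

```python
from operator import add

def vp_binop(op, a, b, mask, vlen):
    """Model of a predicated lane-wise binary op.

    Inactive lanes (mask bit clear, or index >= vlen) yield None,
    standing in for an unspecified result.
    """
    assert len(a) == len(b) == len(mask)
    return [op(x, y) if (m and i < vlen) else None
            for i, (x, y, m) in enumerate(zip(a, b, mask))]

def expand_to_unpredicated(op, a, b):
    """Model of an expansion that simply drops mask and vlen."""
    return [op(x, y) for x, y in zip(a, b)]

def refines(expanded, predicated):
    """True if the expanded result agrees on every lane the VP op defines."""
    return all(p is None or p == e for e, p in zip(expanded, predicated))

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mask = [1, 0, 1, 1]
vlen = 3

predicated = vp_binop(add, a, b, mask, vlen)
expanded = expand_to_unpredicated(add, a, b)
print(predicated, expanded, refines(expanded, predicated))
```

Dropping the predicate is only a valid model for operations without side effects on inactive lanes. For operations that can trap, such as `sdiv` or `udiv`, or that touch memory, an expansion would have to neutralise the inactive lanes first; the model above does not cover that case.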

llvm/test/CodeGen/X86/O0-pipeline.ll

	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Lower Garbage Collection Instructions			; CHECK-NEXT: Lower Garbage Collection Instructions
	; CHECK-NEXT: Shadow Stack GC Lowering			; CHECK-NEXT: Shadow Stack GC Lowering
	; CHECK-NEXT: Lower constant intrinsics			; CHECK-NEXT: Lower constant intrinsics
	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
				; CHECK-NEXT: Expand vector predication intrinsics
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Expand indirectbr instructions			; CHECK-NEXT: Expand indirectbr instructions
	; CHECK-NEXT: Rewrite Symbols			; CHECK-NEXT: Rewrite Symbols
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Exception handling preparation			; CHECK-NEXT: Exception handling preparation
	; CHECK-NEXT: Safe Stack instrumentation pass			; CHECK-NEXT: Safe Stack instrumentation pass

llvm/test/CodeGen/X86/O3-pipeline.ll

	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Branch Probability Analysis			; CHECK-NEXT: Branch Probability Analysis
	; CHECK-NEXT: Block Frequency Analysis			; CHECK-NEXT: Block Frequency Analysis
	; CHECK-NEXT: Constant Hoisting			; CHECK-NEXT: Constant Hoisting
	; CHECK-NEXT: Partially inline calls to library functions			; CHECK-NEXT: Partially inline calls to library functions
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
				; CHECK-NEXT: Expand vector predication intrinsics
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Interleaved Access Pass			; CHECK-NEXT: Interleaved Access Pass
	; CHECK-NEXT: Expand indirectbr instructions			; CHECK-NEXT: Expand indirectbr instructions
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: CodeGen Prepare			; CHECK-NEXT: CodeGen Prepare

llvm/test/Transforms/InstCombine/vp-fsub.ll

This file was added.

				; RUN: opt < %s -instcombine -S | FileCheck %s

				; PR4374

				define <4 x float> @test1_vp(<4 x float> %x, <4 x float> %y, <4 x i1> %M, i32 %L) {
				; CHECK-LABEL: @test1_vp(
				;
				%t1 = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %x, <4 x float> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #0
				%t2 = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>, <4 x float> %t1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #0
				ret <4 x float> %t2
				}

				; Can't do anything with the test above because -0.0 - 0.0 = -0.0, but if we have nsz:
				; -(X - Y) --> Y - X

				; TODO predicated FAdd folding
				define <4 x float> @neg_sub_nsz_vp(<4 x float> %x, <4 x float> %y, <4 x i1> %M, i32 %L) {
				; CH***-LABEL: @neg_sub_nsz_vp(
				;
				%t1 = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %x, <4 x float> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #0
				%t2 = call nsz <4 x float> @llvm.vp.fsub.v4f32(<4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>, <4 x float> %t1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #0
				ret <4 x float> %t2
				}

				; With nsz: Z - (X - Y) --> Z + (Y - X)

				define <4 x float> @sub_sub_nsz_vp(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x i1> %M, i32 %L) {
				; CHECK-LABEL: @sub_sub_nsz_vp(
				; CHECK-NEXT: %1 = call nsz <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %y, <4 x float> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #
				; CHECK-NEXT: %t2 = call nsz <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %z, <4 x float> %1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #
				; CHECK-NEXT: ret <4 x float> %t2
				%t1 = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %x, <4 x float> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #0
				%t2 = call nsz <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %z, <4 x float> %t1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <4 x i1> %M, i32 %L) #0
				ret <4 x float> %t2
				}



				; Function Attrs: nounwind readnone
				declare <4 x float> @llvm.vp.fadd.v4f32(<4 x float>, <4 x float>, metadata, metadata, <4 x i1> mask, i32 vlen)

				; Function Attrs: nounwind readnone
				declare <4 x float> @llvm.vp.fsub.v4f32(<4 x float>, <4 x float>, metadata, metadata, <4 x i1> mask, i32 vlen)

				attributes #0 = { readnone }
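
Note on the `nsz` requirement in vp-fsub.ll: the comment in the test states that `-(X - Y) --> Y - X` needs `nsz`. The Python sketch below reproduces the signed-zero corner case with scalar IEEE-754 doubles. It is an illustration of the arithmetic only and assumes round-to-nearest; the function names are hypothetical.

```python
import math

def is_negative_zero(v):
    """True only for -0.0 (zero with the sign bit set)."""
    return v == 0.0 and math.copysign(1.0, v) < 0.0

def neg_sub(x, y):
    """The original pattern: -0.0 - (x - y)."""
    return -0.0 - (x - y)

def swapped_sub(x, y):
    """The candidate replacement: y - x."""
    return y - x

# When x == y, x - y is +0.0, so the original yields -0.0
# while the replacement yields +0.0.
print(is_negative_zero(neg_sub(1.5, 1.5)))      # original pattern
print(is_negative_zero(swapped_sub(1.5, 1.5)))  # replacement

# For inputs whose difference is non-zero the two forms agree.
print(neg_sub(3.0, 1.0), swapped_sub(3.0, 1.0))
```

The two results differ only in the sign of zero, which is exactly the difference `nsz` permits the optimizer to ignore.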

llvm/test/Transforms/InstSimplify/vp-fsub.ll

This file was added.

				; RUN: opt < %s -instsimplify -S | FileCheck %s

				define <8 x double> @fsub_fadd_fold_vp_xy(<8 x double> %x, <8 x double> %y, <8 x i1> %m, i32 %len) {
				; CHECK-LABEL: fsub_fadd_fold_vp_xy
				; CHECK: ret <8 x double> %x
				%tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %x, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				%res0 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				ret <8 x double> %res0
				}

				define <8 x double> @fsub_fadd_fold_vp_zw(<8 x double> %z, <8 x double> %w, <8 x i1> %m, i32 %len) {
				; CHECK-LABEL: fsub_fadd_fold_vp_zw
				; CHECK: ret <8 x double> %z
				%tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %w, <8 x double> %z, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				%res1 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %w, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				ret <8 x double> %res1
				}

				; REQUIRES-CONSTRAINED-VP: define <8 x double> @fsub_fadd_fold_vp_yx_fpexcept(<8 x double> %x, <8 x double> %y, <8 x i1> %m, i32 %len) #0 {
				; REQUIRES-CONSTRAINED-VP: ; *HECK-LABEL: fsub_fadd_fold_vp_yx
				; REQUIRES-CONSTRAINED-VP: ; *HECK-NEXT: %tmp =
				; REQUIRES-CONSTRAINED-VP: ; *HECK-NEXT: %res2 =
				; REQUIRES-CONSTRAINED-VP: ; *HECK-NEXT: ret
				; REQUIRES-CONSTRAINED-VP: %tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %y, <8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.strict", <8 x i1> %m, i32 %len)
				; REQUIRES-CONSTRAINED-VP: %res2 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.strict", <8 x i1> %m, i32 %len)
				; REQUIRES-CONSTRAINED-VP: ret <8 x double> %res2
				; REQUIRES-CONSTRAINED-VP: }

				define <8 x double> @fsub_fadd_fold_vp_yx_olen(<8 x double> %x, <8 x double> %y, <8 x i1> %m, i32 %len, i32 %otherLen) {
				; CHECK-LABEL: fsub_fadd_fold_vp_yx_olen
				; CHECK-NEXT: %tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %y, <8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %otherLen)
				; CHECK-NEXT: %res3 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				; CHECK-NEXT: ret <8 x double> %res3
				%tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %y, <8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %otherLen)
				%res3 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				ret <8 x double> %res3
				}

				define <8 x double> @fsub_fadd_fold_vp_yx_omask(<8 x double> %x, <8 x double> %y, <8 x i1> %m, i32 %len, <8 x i1> %othermask) {
				; CHECK-LABEL: fsub_fadd_fold_vp_yx_omask
				; CHECK-NEXT: %tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %y, <8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				; CHECK-NEXT: %res4 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %othermask, i32 %len)
				; CHECK-NEXT: ret <8 x double> %res4
				%tmp = call reassoc nsz <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %y, <8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %len)
				%res4 = call reassoc nsz <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %tmp, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %othermask, i32 %len)
				ret <8 x double> %res4
				}

				; Function Attrs: nounwind readnone
				declare <8 x double> @llvm.vp.fadd.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)

				; Function Attrs: nounwind readnone
				declare <8 x double> @llvm.vp.fsub.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)

				attributes #0 = { strictfp }

llvm/test/Verifier/vp-intrinsics-constrained.ll

This file was added.

				; RUN: not opt -S < %s |& FileCheck %s
				; CHECK: VP intrinsics only support the default fp environment for now (round.tonearest; fpexcept.ignore).
				; CHECK: error: input module is broken!

				define void @test_vp_strictfp(<8 x double> %f0, <8 x double> %f1, <8 x double> %f2, <8 x double> %f3, <8 x i1> %m, i32 %n) #0 {
				%r0 = call <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.strict", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_rounding(<8 x double> %f0, <8 x double> %f1, <8 x double> %f2, <8 x double> %f3, <8 x i1> %m, i32 %n) #0 {
				%r0 = call <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tozero", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				declare <8 x double> @llvm.vp.fadd.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)

				attributes #0 = { strictfp }
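
Each function in this test departs from the default floating-point environment in exactly one operand: `test_vp_strictfp` passes `fpexcept.strict`, and `test_vp_rounding` passes `round.tozero`. The sketch below is a rough, standalone model of the condition named in the CHECK line, with the two metadata operands represented as strings. The helper name `isDefaultFPEnv` and the use of `std::optional` for an absent operand are assumptions made for illustration; this is not the verifier code from the patch.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not part of LLVM: returns true when the operands
// describe the default fp environment named in the diagnostic
// (round.tonearest; fpexcept.ignore). An absent operand, such as the
// missing rounding mode on llvm.vp.fneg, is treated as the default.
bool isDefaultFPEnv(const std::optional<std::string> &Rounding,
                    const std::optional<std::string> &Except) {
  bool RoundingOk = !Rounding || *Rounding == "round.tonearest";
  bool ExceptOk = !Except || *Except == "fpexcept.ignore";
  return RoundingOk && ExceptOk;
}
```

Under this model, both calls in the test would be rejected, while a call with `round.tonearest` and `fpexcept.ignore` would be accepted.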

llvm/test/Verifier/vp-intrinsics.ll

This file was added.

				; RUN: opt --verify %s

				define void @test_vp_int(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n) {
				%r0 = call <8 x i32> @llvm.vp.add.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r1 = call <8 x i32> @llvm.vp.sub.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r2 = call <8 x i32> @llvm.vp.mul.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r3 = call <8 x i32> @llvm.vp.sdiv.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r4 = call <8 x i32> @llvm.vp.srem.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r5 = call <8 x i32> @llvm.vp.udiv.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r6 = call <8 x i32> @llvm.vp.urem.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r7 = call <8 x i32> @llvm.vp.and.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r8 = call <8 x i32> @llvm.vp.or.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%r9 = call <8 x i32> @llvm.vp.xor.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%rA = call <8 x i32> @llvm.vp.ashr.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%rB = call <8 x i32> @llvm.vp.lshr.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				%rC = call <8 x i32> @llvm.vp.shl.v8i32(<8 x i32> %i0, <8 x i32> %i1, <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_constrainedfp(<8 x double> %f0, <8 x double> %f1, <8 x double> %f2, <8 x double> %f3, <8 x i1> %m, i32 %n) {
				%r0 = call <8 x double> @llvm.vp.fadd.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r1 = call <8 x double> @llvm.vp.fsub.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r2 = call <8 x double> @llvm.vp.fmul.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r3 = call <8 x double> @llvm.vp.fdiv.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r4 = call <8 x double> @llvm.vp.frem.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r5 = call <8 x double> @llvm.vp.fma.v8f64(<8 x double> %f0, <8 x double> %f1, <8 x double> %f2, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r6 = call <8 x double> @llvm.vp.fneg.v8f64(<8 x double> %f2, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r7 = call <8 x double> @llvm.vp.minnum.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r8 = call <8 x double> @llvm.vp.maxnum.v8f64(<8 x double> %f0, <8 x double> %f1, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_fpcast(<8 x double> %x, <8 x i64> %y, <8 x float> %z, <8 x i1> %m, i32 %n) {
				%r0 = call <8 x i64> @llvm.vp.fptosi.v8i64v8f64(<8 x double> %x, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r1 = call <8 x i64> @llvm.vp.fptoui.v8i64v8f64(<8 x double> %x, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r2 = call <8 x double> @llvm.vp.sitofp.v8f64v8i64(<8 x i64> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r3 = call <8 x double> @llvm.vp.uitofp.v8f64v8i64(<8 x i64> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r4 = call <8 x double> @llvm.vp.rint.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r7 = call <8 x double> @llvm.vp.round.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rA = call <8 x double> @llvm.vp.nearbyint.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rB = call <8 x double> @llvm.vp.ceil.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rC = call <8 x double> @llvm.vp.floor.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rD = call <8 x double> @llvm.vp.trunc.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rE = call <8 x float> @llvm.vp.fptrunc.v8f32v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%rF = call <8 x double> @llvm.vp.fpext.v8f64v8f32(<8 x float> %z, metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_vp_fpfuncs(<8 x double> %x, <8 x double> %y, <8 x i1> %m, i32 %n) {
				%r0 = call <8 x double> @llvm.vp.pow.v8f64(<8 x double> %x, <8 x double> %y, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r1 = call <8 x double> @llvm.vp.powi.v8f64(<8 x double> %x, i32 %n, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r2 = call <8 x double> @llvm.vp.sqrt.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r3 = call <8 x double> @llvm.vp.sin.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r4 = call <8 x double> @llvm.vp.cos.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r5 = call <8 x double> @llvm.vp.log.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r6 = call <8 x double> @llvm.vp.log10.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r7 = call <8 x double> @llvm.vp.log2.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r8 = call <8 x double> @llvm.vp.exp.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				%r9 = call <8 x double> @llvm.vp.exp2.v8f64(<8 x double> %x, metadata !"round.tonearest", metadata !"fpexcept.ignore", <8 x i1> %m, i32 %n)
				ret void
				}

				define void @test_mem(<16 x i32*> %p0, <16 x i32>* %p1, <16 x i32> %i0, <16 x i1> %m, i32 %n) {
				call void @llvm.vp.store.v16i32.p0v16i32(<16 x i32> %i0, <16 x i32>* %p1, <16 x i1> %m, i32 %n)
				call void @llvm.vp.scatter.v16i32.v16p0i32(<16 x i32> %i0 , <16 x i32*> %p0, <16 x i1> %m, i32 %n)
				%l0 = call <16 x i32> @llvm.vp.load.v16i32.p0v16i32(<16 x i32>* %p1, <16 x i1> %m, i32 %n)
				%l1 = call <16 x i32> @llvm.vp.gather.v16i32.v16p0i32(<16 x i32*> %p0, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_reduce_fp(<16 x float> %v, <16 x i1> %m, i32 %n) {
				%r0 = call float @llvm.vp.reduce.fadd.v16f32(float 0.0, <16 x float> %v, <16 x i1> %m, i32 %n)
				%r1 = call float @llvm.vp.reduce.fmul.v16f32(float 42.0, <16 x float> %v, <16 x i1> %m, i32 %n)
				%r2 = call float @llvm.vp.reduce.fmin.v16f32(<16 x float> %v, <16 x i1> %m, i32 %n)
				%r3 = call float @llvm.vp.reduce.fmax.v16f32(<16 x float> %v, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_reduce_int(<16 x i32> %v, <16 x i1> %m, i32 %n) {
				%r0 = call i32 @llvm.vp.reduce.add.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r1 = call i32 @llvm.vp.reduce.mul.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r2 = call i32 @llvm.vp.reduce.and.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r3 = call i32 @llvm.vp.reduce.xor.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r4 = call i32 @llvm.vp.reduce.or.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r5 = call i32 @llvm.vp.reduce.smin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r6 = call i32 @llvm.vp.reduce.smax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r7 = call i32 @llvm.vp.reduce.umin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				%r8 = call i32 @llvm.vp.reduce.umax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_shuffle(<16 x float> %v0, <16 x float> %v1, <16 x i1> %m, i32 %k, i32 %n) {
				%r0 = call <16 x float> @llvm.vp.select.v16f32(<16 x i1> %m, <16 x float> %v0, <16 x float> %v1, i32 %n)
				%r1 = call <16 x float> @llvm.vp.compose.v16f32(<16 x float> %v0, <16 x float> %v1, i32 %k, i32 %n)
				%r2 = call <16 x float> @llvm.vp.vshift.v16f32(<16 x float> %v0, i32 %k, <16 x i1> %m, i32 %n)
				%r3 = call <16 x float> @llvm.vp.compress.v16f32(<16 x float> %v0, <16 x i1> %m, i32 %n)
				%r4 = call <16 x float> @llvm.vp.expand.v16f32(<16 x float> %v0, <16 x i1> %m, i32 %n)
				ret void
				}

				define void @test_xcmp(<16 x i32> %i0, <16 x i32> %i1, <16 x float> %f0, <16 x float> %f1,<16 x i1> %m, i32 %n) {
				%r0 = call <16 x i1> @llvm.vp.icmp.v16i32(<16 x i32> %i0, <16 x i32> %i1, i8 38, <16 x i1> %m, i32 %n)
				%r1 = call <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> %f0, <16 x float> %f1, i8 10, <16 x i1> %m, i32 %n)
				ret void
				}

				; integer arith
				declare <8 x i32> @llvm.vp.add.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.sub.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.mul.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.sdiv.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.srem.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.udiv.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.urem.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				; bit arith
				declare <8 x i32> @llvm.vp.and.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.xor.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.or.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.ashr.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.lshr.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)
				declare <8 x i32> @llvm.vp.shl.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen)

				; floating point arith
				declare <8 x double> @llvm.vp.fadd.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fsub.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fmul.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fdiv.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.frem.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fma.v8f64(<8 x double>, <8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fneg.v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.minnum.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.maxnum.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen)

				; cast & conversions
				declare <8 x i64> @llvm.vp.fptosi.v8i64v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x i64> @llvm.vp.fptoui.v8i64v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.sitofp.v8f64v8i64(<8 x i64>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.uitofp.v8f64v8i64(<8 x i64>, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.rint.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.round.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.nearbyint.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.ceil.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.floor.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.trunc.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x float> @llvm.vp.fptrunc.v8f32v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.fpext.v8f64v8f32(<8 x float> %x, metadata, <8 x i1> mask, i32 vlen)

				; math ops
				declare <8 x double> @llvm.vp.pow.v8f64(<8 x double> %x, <8 x double> %y, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.powi.v8f64(<8 x double> %x, i32 %y, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.sqrt.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.sin.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.cos.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.log.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.log10.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.log2.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.exp.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)
				declare <8 x double> @llvm.vp.exp2.v8f64(<8 x double> %x, metadata, metadata, <8 x i1> mask, i32 vlen)

				; memory
				declare void @llvm.vp.store.v16i32.p0v16i32(<16 x i32>, <16 x i32>*, <16 x i1> mask, i32 vlen)
				declare <16 x i32> @llvm.vp.load.v16i32.p0v16i32(<16 x i32>*, <16 x i1> mask, i32 vlen)
				declare void @llvm.vp.scatter.v16i32.v16p0i32(<16 x i32>, <16 x i32*>, <16 x i1> mask, i32 vlen)
				declare <16 x i32> @llvm.vp.gather.v16i32.v16p0i32(<16 x i32*>, <16 x i1> mask, i32 vlen)

				; reductions
				declare float @llvm.vp.reduce.fadd.v16f32(float, <16 x float>, <16 x i1> mask, i32 vlen)
				declare float @llvm.vp.reduce.fmul.v16f32(float, <16 x float>, <16 x i1> mask, i32 vlen)
				declare float @llvm.vp.reduce.fmin.v16f32(<16 x float>, <16 x i1> mask, i32 vlen)
				declare float @llvm.vp.reduce.fmax.v16f32(<16 x float>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.add.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.mul.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.and.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.xor.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.or.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen)
				declare i32 @llvm.vp.reduce.smax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				declare i32 @llvm.vp.reduce.smin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				declare i32 @llvm.vp.reduce.umax.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)
				declare i32 @llvm.vp.reduce.umin.v16i32(<16 x i32> %v, <16 x i1> %m, i32 %n)

				; shuffles
				declare <16 x float> @llvm.vp.select.v16f32(<16 x i1>, <16 x float>, <16 x float>, i32 vlen)
				declare <16 x float> @llvm.vp.compose.v16f32(<16 x float>, <16 x float>, i32, i32 vlen)
				declare <16 x float> @llvm.vp.vshift.v16f32(<16 x float>, i32, <16 x i1>, i32 vlen)
				declare <16 x float> @llvm.vp.compress.v16f32(<16 x float>, <16 x i1>, i32 vlen)
				declare <16 x float> @llvm.vp.expand.v16f32(<16 x float>, <16 x i1> mask, i32 vlen)

				; icmp , fcmp
				declare <16 x i1> @llvm.vp.icmp.v16i32(<16 x i32>, <16 x i32>, i8, <16 x i1> mask, i32 vlen)
				declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float>, <16 x float>, i8, <16 x i1> mask, i32 vlen)

llvm/test/Verifier/vp_attributes.ll

This file was added.

				; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s

				declare void @a(<16 x i1> mask %a, <16 x i1> mask %b)
				; CHECK: Cannot have multiple 'mask' parameters!

				declare void @b(<16 x i1> mask %a, i32 vlen %x, i32 vlen %y)
				; CHECK: Cannot have multiple 'vlen' parameters!

				declare <16 x double> @c(<16 x double> passthru %a)
				; CHECK: Cannot have 'passthru' parameter without 'mask' parameter!

				declare <16 x double> @d(<16 x double> passthru %a, <16 x i1> mask %M, <16 x double> passthru %b)
				; CHECK: Cannot have multiple 'passthru' parameters!
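
The four CHECK lines above amount to a small set of counting rules over a function's parameter attributes. As a hedged illustration only, the sketch below restates those rules in standalone C++. The `ParamAttr` enum and the `checkVPAttrs` function are hypothetical names invented here, and the order in which the diagnostics are tried is an assumption; the real checks live in the patch and may be structured differently.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a parameter attribute; not an LLVM type.
enum class ParamAttr { None, Mask, VLen, Passthru };

// Returns the first diagnostic triggered by the parameter list, or
// std::nullopt if the list satisfies all four rules from the test.
std::optional<std::string> checkVPAttrs(const std::vector<ParamAttr> &Params) {
  int NumMask = 0, NumVLen = 0, NumPassthru = 0;
  for (ParamAttr A : Params) {
    switch (A) {
    case ParamAttr::Mask:
      if (++NumMask > 1)
        return std::string("Cannot have multiple 'mask' parameters!");
      break;
    case ParamAttr::VLen:
      if (++NumVLen > 1)
        return std::string("Cannot have multiple 'vlen' parameters!");
      break;
    case ParamAttr::Passthru:
      if (++NumPassthru > 1)
        return std::string("Cannot have multiple 'passthru' parameters!");
      break;
    case ParamAttr::None:
      break;
    }
  }
  // A passthru value only makes sense when some lanes can be masked off.
  if (NumPassthru > 0 && NumMask == 0)
    return std::string(
        "Cannot have 'passthru' parameter without 'mask' parameter!");
  return std::nullopt;
}
```

For instance, the parameter list of `@d` (passthru, mask, passthru) trips the "multiple 'passthru'" rule in this sketch, and `@c` (a lone passthru) trips the "without 'mask'" rule.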

llvm/tools/llc/llc.cpp

int main(int argc, char **argv) {
initializeEntryExitInstrumenterPass(*Registry);		initializeEntryExitInstrumenterPass(*Registry);
initializePostInlineEntryExitInstrumenterPass(*Registry);		initializePostInlineEntryExitInstrumenterPass(*Registry);
initializeUnreachableBlockElimLegacyPassPass(*Registry);		initializeUnreachableBlockElimLegacyPassPass(*Registry);
initializeConstantHoistingLegacyPassPass(*Registry);		initializeConstantHoistingLegacyPassPass(*Registry);
initializeScalarOpts(*Registry);		initializeScalarOpts(*Registry);
initializeVectorization(*Registry);		initializeVectorization(*Registry);
initializeScalarizeMaskedMemIntrinPass(*Registry);		initializeScalarizeMaskedMemIntrinPass(*Registry);
initializeExpandReductionsPass(*Registry);		initializeExpandReductionsPass(*Registry);
		initializeExpandVectorPredicationPass(*Registry);
initializeHardwareLoopsPass(*Registry);		initializeHardwareLoopsPass(*Registry);

// Initialize debugging passes.		// Initialize debugging passes.
initializeScavengerTestPass(*Registry);		initializeScavengerTestPass(*Registry);

// Register the target printer for --version.		// Register the target printer for --version.
cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);		cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);


llvm/tools/opt/opt.cpp

int main(int argc, char **argv) {
initializeGlobalMergePass(Registry);		initializeGlobalMergePass(Registry);
initializeIndirectBrExpandPassPass(Registry);		initializeIndirectBrExpandPassPass(Registry);
initializeInterleavedLoadCombinePass(Registry);		initializeInterleavedLoadCombinePass(Registry);
initializeInterleavedAccessPass(Registry);		initializeInterleavedAccessPass(Registry);
initializeEntryExitInstrumenterPass(Registry);		initializeEntryExitInstrumenterPass(Registry);
initializePostInlineEntryExitInstrumenterPass(Registry);		initializePostInlineEntryExitInstrumenterPass(Registry);
initializeUnreachableBlockElimLegacyPassPass(Registry);		initializeUnreachableBlockElimLegacyPassPass(Registry);
initializeExpandReductionsPass(Registry);		initializeExpandReductionsPass(Registry);
		initializeExpandVectorPredicationPass(Registry);
initializeWasmEHPreparePass(Registry);		initializeWasmEHPreparePass(Registry);
initializeWriteBitcodePassPass(Registry);		initializeWriteBitcodePassPass(Registry);
initializeHardwareLoopsPass(Registry);		initializeHardwareLoopsPass(Registry);
initializeTypePromotionPass(Registry);		initializeTypePromotionPass(Registry);

#ifdef BUILD_EXAMPLES		#ifdef BUILD_EXAMPLES
initializeExampleIRTransforms(Registry);		initializeExampleIRTransforms(Registry);
#endif		#endif

llvm/unittests/IR/CMakeLists.txt

add_llvm_unittest(IRTests
TypesTest.cpp		TypesTest.cpp
UseTest.cpp		UseTest.cpp
UserTest.cpp		UserTest.cpp
ValueHandleTest.cpp		ValueHandleTest.cpp
ValueMapTest.cpp		ValueMapTest.cpp
ValueTest.cpp		ValueTest.cpp
VectorTypesTest.cpp		VectorTypesTest.cpp
VerifierTest.cpp		VerifierTest.cpp
		VPIntrinsicTest.cpp
WaymarkTest.cpp		WaymarkTest.cpp
)		)

target_link_libraries(IRTests PRIVATE LLVMTestingSupport)		target_link_libraries(IRTests PRIVATE LLVMTestingSupport)

llvm/unittests/IR/VPIntrinsicTest.cpp

This file was added.

				//===- VPIntrinsicTest.cpp - VPIntrinsic unit tests ---------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/SmallVector.h"
				#include "llvm/AsmParser/Parser.h"
				#include "llvm/IR/Constants.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/LLVMContext.h"
				#include "llvm/IR/Module.h"
				#include "llvm/IR/Verifier.h"
				#include "llvm/Support/SourceMgr.h"
				#include "gtest/gtest.h"

				using namespace llvm;

				namespace {

				class VPIntrinsicTest : public testing::Test {
				protected:
				LLVMContext Context;

				VPIntrinsicTest() : Context() {}

				LLVMContext C;
				SMDiagnostic Err;

				std::unique_ptr<Module> CreateVPDeclarationModule() {
				return parseAssemblyString(
				" declare <8 x double> @llvm.vp.fadd.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.fsub.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.fmul.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.fdiv.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.frem.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.fma.v8f64(<8 x double>, <8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.fneg.v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.minnum.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.maxnum.v8f64(<8 x double>, <8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.add.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.sub.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.mul.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.sdiv.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.srem.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.udiv.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.urem.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.and.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.xor.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.or.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.ashr.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.lshr.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare <8 x i32> @llvm.vp.shl.v8i32(<8 x i32>, <8 x i32>, <8 x i1> mask, i32 vlen) "
				" declare void @llvm.vp.store.v16i32.p0v16i32(<16 x i32>, <16 x i32>*, <16 x i1> mask, i32 vlen) "
				" declare void @llvm.vp.scatter.v16i32.v16p0i32(<16 x i32>, <16 x i32*>, <16 x i1> mask, i32 vlen) "
				" declare <16 x i32> @llvm.vp.load.v16i32.p0v16i32(<16 x i32>*, <16 x i1> mask, i32 vlen) "
				" declare <16 x i32> @llvm.vp.gather.v16i32.v16p0i32(<16 x i32*>, <16 x i1> mask, i32 vlen) "
				" declare float @llvm.vp.reduce.fadd.v16f32(float, <16 x float>, <16 x i1> mask, i32 vlen) "
				" declare float @llvm.vp.reduce.fmul.v16f32(float, <16 x float>, <16 x i1> mask, i32 vlen) "
				" declare float @llvm.vp.reduce.fmin.v16f32(<16 x float>, <16 x i1> mask, i32 vlen) "
				" declare float @llvm.vp.reduce.fmax.v16f32(<16 x float>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.add.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.mul.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.and.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.xor.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.or.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.smin.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.smax.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.umin.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare i32 @llvm.vp.reduce.umax.v16i32(<16 x i32>, <16 x i1> mask, i32 vlen) "
				" declare <16 x float> @llvm.vp.select.v16f32(<16 x i1>, <16 x float>, <16 x float>, i32 vlen) "
				" declare <16 x float> @llvm.vp.compose.v16f32(<16 x float>, <16 x float>, i32, i32 vlen) "
				" declare <16 x float> @llvm.vp.vshift.v16f32(<16 x float>, i32, <16 x i1>, i32 vlen) "
				" declare <16 x float> @llvm.vp.compress.v16f32(<16 x float>, <16 x i1>, i32 vlen) "
				" declare <16 x float> @llvm.vp.expand.v16f32(<16 x float>, <16 x i1> mask, i32 vlen) "
				" declare <16 x i1> @llvm.vp.icmp.v16i32(<16 x i32>, <16 x i32>, i8, <16 x i1> mask, i32 vlen) "
				" declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float>, <16 x float>, i8, <16 x i1> mask, i32 vlen) "
				" declare <8 x i64> @llvm.vp.fptosi.v8i64v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x i64> @llvm.vp.fptoui.v8i64v8f64(<8 x double>, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.sitofp.v8f64v8i64(<8 x i64>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.uitofp.v8f64v8i64(<8 x i64>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.rint.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.round.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.nearbyint.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.ceil.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.floor.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.trunc.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x float> @llvm.vp.fptrunc.v8f32v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.fpext.v8f64v8f32(<8 x float>, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.pow.v8f64(<8 x double>, <8 x double> %y, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.powi.v8f64(<8 x double>, i32 %y, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.sqrt.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.sin.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.cos.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.log.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.log10.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.log2.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.exp.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) "
				" declare <8 x double> @llvm.vp.exp2.v8f64(<8 x double>, metadata, metadata, <8 x i1> mask, i32 vlen) ",
				Err, C);
				}
				};

				/// Check that VPIntrinsic::canIgnoreVectorLengthParam() returns true
				/// if the vector length parameter does not mask off any lanes.
				TEST_F(VPIntrinsicTest, CanIgnoreVectorLength) {
				LLVMContext C;
				SMDiagnostic Err;

				std::unique_ptr<Module> M =
				parseAssemblyString(
				"declare <256 x i64> @llvm.vp.mul.v256i64(<256 x i64>, <256 x i64>, <256 x i1>, i32)"
				"declare <vscale x 2 x i64> @llvm.vp.mul.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i64>, <vscale x 2 x i1>, i32)"
				"define void @test_static_vlen( "
				" <256 x i64> %i0, <vscale x 2 x i64> %si0,"
				" <256 x i64> %i1, <vscale x 2 x i64> %si1,"
				" <256 x i1> %m, <vscale x 2 x i1> %sm, i32 %vl) { "
				" %r0 = call <256 x i64> @llvm.vp.mul.v256i64(<256 x i64> %i0, <256 x i64> %i1, <256 x i1> %m, i32 %vl)"
				" %r1 = call <256 x i64> @llvm.vp.mul.v256i64(<256 x i64> %i0, <256 x i64> %i1, <256 x i1> %m, i32 256)"
				" %r2 = call <256 x i64> @llvm.vp.mul.v256i64(<256 x i64> %i0, <256 x i64> %i1, <256 x i1> %m, i32 0)"
				" %r3 = call <256 x i64> @llvm.vp.mul.v256i64(<256 x i64> %i0, <256 x i64> %i1, <256 x i1> %m, i32 -1)"
				" %r4 = call <256 x i64> @llvm.vp.mul.v256i64(<256 x i64> %i0, <256 x i64> %i1, <256 x i1> %m, i32 123)"
				" %r5 = call <vscale x 2 x i64> @llvm.vp.mul.nxv2i64(<vscale x 2 x i64> %si0, <vscale x 2 x i64> %si1, <vscale x 2 x i1> %sm, i32 -1)"
				" %r6 = call <vscale x 2 x i64> @llvm.vp.mul.nxv2i64(<vscale x 2 x i64> %si0, <vscale x 2 x i64> %si1, <vscale x 2 x i1> %sm, i32 99999)"
				" ret void "
				"}",
				Err, C);

				auto *F = M->getFunction("test_static_vlen");
				assert(F);

				const int NumExpected = 7;
				const bool Expected[] = {false, true, false, true, false, true, false};
				int i = 0;
				for (auto &I : F->getEntryBlock()) {
				VPIntrinsic *VPI = dyn_cast<VPIntrinsic>(&I);
				if (!VPI) {
				ASSERT_TRUE(I.isTerminator());
				continue;
				}

				ASSERT_LT(i, NumExpected);
				ASSERT_EQ(Expected[i], VPI->canIgnoreVectorLengthParam());
				++i;
				}
				}

				/// Check that the argument returned by
				/// VPIntrinsic::Get<X>ParamPos(Intrinsic::ID) has the expected type.
				TEST_F(VPIntrinsicTest, GetParamPos) {
				std::unique_ptr<Module> M = CreateVPDeclarationModule();
				assert(M);

				for (Function &F : *M) {
				ASSERT_TRUE(F.isIntrinsic());
				Optional<int> MaskParamPos =
				VPIntrinsic::GetMaskParamPos(F.getIntrinsicID());
				if (MaskParamPos.hasValue()) {
				Type *MaskParamType = F.getArg(MaskParamPos.getValue())->getType();
				ASSERT_TRUE(MaskParamType->isVectorTy());
				ASSERT_TRUE(MaskParamType->getVectorElementType()->isIntegerTy(1));
				}

				Optional<int> VecLenParamPos =
				VPIntrinsic::GetVectorLengthParamPos(F.getIntrinsicID());
				if (VecLenParamPos.hasValue()) {
				Type *VecLenParamType = F.getArg(VecLenParamPos.getValue())->getType();
				ASSERT_TRUE(VecLenParamType->isIntegerTy(32));
				}

				Optional<int> MemPtrParamPos = VPIntrinsic::GetMemoryPointerParamPos(F.getIntrinsicID());
				if (MemPtrParamPos.hasValue()) {
				Type *MemPtrParamType = F.getArg(MemPtrParamPos.getValue())->getType();
				ASSERT_TRUE(MemPtrParamType->isPtrOrPtrVectorTy());
				}

				Optional<int> RoundingParamPos = VPIntrinsic::GetRoundingModeParamPos(F.getIntrinsicID());
				if (RoundingParamPos.hasValue()) {
				Type *RoundingParamType = F.getArg(RoundingParamPos.getValue())->getType();
				ASSERT_TRUE(RoundingParamType->isMetadataTy());
				}

				Optional<int> ExceptParamPos = VPIntrinsic::GetExceptionBehaviorParamPos(F.getIntrinsicID());
				if (ExceptParamPos.hasValue()) {
				Type *ExceptParamType = F.getArg(ExceptParamPos.getValue())->getType();
				ASSERT_TRUE(ExceptParamType->isMetadataTy());
				}
				}
				}

				/// Check that going from Opcode to VP intrinsic and back results in the same
				/// Opcode.
				TEST_F(VPIntrinsicTest, OpcodeRoundTrip) {
				std::vector<unsigned> Opcodes;
				Opcodes.reserve(100);

				{
				#define HANDLE_INST(OCNum, OCName, Class) Opcodes.push_back(OCNum);
				#include "llvm/IR/Instruction.def"
				}

				unsigned FullTripCounts = 0;
				for (unsigned OC : Opcodes) {
				Intrinsic::ID VPID = VPIntrinsic::GetForOpcode(OC);
				// no equivalent VP intrinsic available
				if (VPID == Intrinsic::not_intrinsic)
				continue;

				unsigned RoundTripOC = VPIntrinsic::GetFunctionalOpcodeForVP(VPID);
				// no equivalent Opcode available
				if (RoundTripOC == Instruction::Call)
				continue;

				ASSERT_EQ(RoundTripOC, OC);
				++FullTripCounts;
				}
				ASSERT_NE(FullTripCounts, 0u);
				}

				} // end anonymous namespace
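
The `CanIgnoreVectorLength` expectations above can be read as a small decision rule. The following is a minimal standalone sketch of that rule, reconstructed only from the `Expected[]` table in the test; it is not the patch's implementation of `VPIntrinsic::canIgnoreVectorLengthParam()`. The `VLQuery` struct and the free function are hypothetical names, and treating every negative constant like `-1` is an assumption (the test only covers `-1`).

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for what the real query would read off the IR.
struct VLQuery {
  std::optional<uint64_t> StaticElemCount; // nullopt for scalable vectors
  std::optional<int64_t> ConstVL;          // nullopt if the length is a runtime value
};

// Assumed rule, inferred from the test's Expected[] table only.
bool canIgnoreVectorLength(const VLQuery &Q) {
  if (!Q.ConstVL)
    return false; // runtime value such as %vl: must be honoured
  if (*Q.ConstVL < 0)
    return true;  // -1 selects all lanes (other negatives: assumption)
  if (!Q.StaticElemCount)
    return false; // scalable vector: lane count unknown at compile time
  // Fixed-width vector: ignorable only if the constant covers every lane.
  return static_cast<uint64_t>(*Q.ConstVL) >= *Q.StaticElemCount;
}
```

With this model, a `<256 x i64>` operation with constant length 256 or -1 is ignorable, while lengths 0, 123, or a runtime `%vl` are not, matching `%r0` through `%r4`; the scalable cases `%r5` and `%r6` follow from the two early returns.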

llvm/utils/TableGen/CodeGenIntrinsics.h

[117 lines not shown]	struct CodeGenIntrinsic {
bool canThrow;		bool canThrow;

/// True if the intrinsic is marked as noduplicate.		/// True if the intrinsic is marked as noduplicate.
bool isNoDuplicate;		bool isNoDuplicate;

/// True if the intrinsic is no-return.		/// True if the intrinsic is no-return.
bool isNoReturn;		bool isNoReturn;

		/// True if the intrinsic is no-sync.
		bool isNoSync;

/// True if the intrinsic is will-return.		/// True if the intrinsic is will-return.
bool isWillReturn;		bool isWillReturn;

/// True if the intrinsic is cold.		/// True if the intrinsic is cold.
bool isCold;		bool isCold;

/// True if the intrinsic is marked as convergent.		/// True if the intrinsic is marked as convergent.
bool isConvergent;		bool isConvergent;

/// True if the intrinsic has side effects that aren't captured by any		/// True if the intrinsic has side effects that aren't captured by any
/// of the other flags.		/// of the other flags.
bool hasSideEffects;		bool hasSideEffects;

// True if the intrinsic is marked as speculatable.		// True if the intrinsic is marked as speculatable.
bool isSpeculatable;		bool isSpeculatable;

enum ArgAttribute {		enum ArgAttribute {
NoCapture,		NoCapture,
NoAlias,		NoAlias,
Returned,		Returned,
ReadOnly,		ReadOnly,
WriteOnly,		WriteOnly,
ReadNone,		ReadNone,
ImmArg		ImmArg,
		Mask,
		VectorLength,
		Passthru
};		};

std::vector<std::pair<unsigned, ArgAttribute>> ArgumentAttributes;		std::vector<std::pair<unsigned, ArgAttribute>> ArgumentAttributes;

bool hasProperty(enum SDNP Prop) const {		bool hasProperty(enum SDNP Prop) const {
return Properties & (1 << Prop);		return Properties & (1 << Prop);
}		}

[36 lines not shown]
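
To illustrate how the extended `ArgAttribute` enum and the `ArgumentAttributes` list fit together, here is a minimal standalone sketch. The enum and the pair-vector type follow the diff above; the helper `findArgWith` and the example annotation are hypothetical and not part of the patch. The argument positions (mask at index 2, vector length at index 3) are assumed from the `llvm.vp.mul` declarations used in the unit test.

```cpp
#include <optional>
#include <utility>
#include <vector>

// Mirrors the enum in the diff, including the three new entries.
enum ArgAttribute {
  NoCapture, NoAlias, Returned, ReadOnly, WriteOnly, ReadNone, ImmArg,
  Mask, VectorLength, Passthru
};

// Same shape as CodeGenIntrinsic::ArgumentAttributes: (argument index, attribute).
using ArgAttrList = std::vector<std::pair<unsigned, ArgAttribute>>;

// Hypothetical helper: index of the first argument carrying `Kind`,
// or nullopt if no argument has it.
std::optional<unsigned> findArgWith(const ArgAttrList &Attrs, ArgAttribute Kind) {
  for (const auto &[ArgNo, Attr] : Attrs)
    if (Attr == Kind)
      return ArgNo;
  return std::nullopt;
}

// Assumed annotation for an intrinsic shaped like llvm.vp.mul:
// (lhs, rhs, mask, vector length).
ArgAttrList makeVPMulLikeAttrs() {
  return {{2u, Mask}, {3u, VectorLength}};
}
```

A lookup of this kind is roughly what queries such as `GetMaskParamPos` need to answer, although the patch may derive the positions differently.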

llvm/utils/TableGen/CodeGenTarget.cpp

[601 lines not shown]	CodeGenIntrinsic::CodeGenIntrinsic(Record *R) {
std::string DefName = std::string(R->getName());		std::string DefName = std::string(R->getName());
ArrayRef<SMLoc> DefLoc = R->getLoc();		ArrayRef<SMLoc> DefLoc = R->getLoc();
ModRef = ReadWriteMem;		ModRef = ReadWriteMem;
Properties = 0;		Properties = 0;
isOverloaded = false;		isOverloaded = false;
isCommutative = false;		isCommutative = false;
canThrow = false;		canThrow = false;
isNoReturn = false;		isNoReturn = false;
		isNoSync = false;
isWillReturn = false;		isWillReturn = false;
isCold = false;		isCold = false;
isNoDuplicate = false;		isNoDuplicate = false;
isConvergent = false;		isConvergent = false;
isSpeculatable = false;		isSpeculatable = false;
hasSideEffects = false;		hasSideEffects = false;

if (DefName.size() <= 4 ||		if (DefName.size() <= 4 ||
[103 lines not shown]	if (TyEl->isSubClassOf("LLVMMatchType")) {
PrintFatalError(DefLoc,		PrintFatalError(DefLoc,
Twine("ParamTypes is ") + TypeList->getAsString());		Twine("ParamTypes is ") + TypeList->getAsString());
}		}
VT = OverloadedVTs[MatchTy];		VT = OverloadedVTs[MatchTy];
// It only makes sense to use the extended and truncated vector element		// It only makes sense to use the extended and truncated vector element
// variants with iAny types; otherwise, if the intrinsic is not		// variants with iAny types; otherwise, if the intrinsic is not
// overloaded, all the types can be specified directly.		// overloaded, all the types can be specified directly.
assert(((!TyEl->isSubClassOf("LLVMExtendedType") &&		assert(((!TyEl->isSubClassOf("LLVMExtendedType") &&
!TyEl->isSubClassOf("LLVMTruncatedType") &&		!TyEl->isSubClassOf("LLVMTruncatedType")) ||
!TyEl->isSubClassOf("LLVMScalarOrSameVectorWidth")) ||
VT == MVT::iAny || VT == MVT::vAny) &&		VT == MVT::iAny || VT == MVT::vAny) &&
"Expected iAny or vAny type");		"Expected iAny or vAny type");
} else		} else
VT = getValueType(TyEl->getValueAsDef("VT"));		VT = getValueType(TyEl->getValueAsDef("VT"));

// Reject invalid types.		// Reject invalid types.
if (VT == MVT::isVoid && i != e-1 /*void at end means varargs*/)		if (VT == MVT::isVoid && i != e-1 /*void at end means varargs*/)
PrintFatalError(DefLoc, "Intrinsic '" + DefName +		PrintFatalError(DefLoc, "Intrinsic '" + DefName +
[28 lines not shown]	for (unsigned i = 0, e = PropList->size(); i != e; ++i) {
else if (Property->getName() == "Throws")		else if (Property->getName() == "Throws")
canThrow = true;		canThrow = true;
else if (Property->getName() == "IntrNoDuplicate")		else if (Property->getName() == "IntrNoDuplicate")
isNoDuplicate = true;		isNoDuplicate = true;
else if (Property->getName() == "IntrConvergent")		else if (Property->getName() == "IntrConvergent")
isConvergent = true;		isConvergent = true;
else if (Property->getName() == "IntrNoReturn")		else if (Property->getName() == "IntrNoReturn")
isNoReturn = true;		isNoReturn = true;
		else if (Property->getName() == "IntrNoSync")
		isNoSync = true;
else if (Property->getName() == "IntrWillReturn")		else if (Property->getName() == "IntrWillReturn")
isWillReturn = true;		isWillReturn = true;
else if (Property->getName() == "IntrCold")		else if (Property->getName() == "IntrCold")
isCold = true;		isCold = true;
else if (Property->getName() == "IntrSpeculatable")		else if (Property->getName() == "IntrSpeculatable")
isSpeculatable = true;		isSpeculatable = true;
else if (Property->getName() == "IntrHasSideEffects")		else if (Property->getName() == "IntrHasSideEffects")
hasSideEffects = true;		hasSideEffects = true;
else if (Property->isSubClassOf("NoCapture")) {		else if (Property->isSubClassOf("NoCapture")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, NoCapture));		ArgumentAttributes.push_back(std::make_pair(ArgNo, NoCapture));
} else if (Property->isSubClassOf("NoAlias")) {		} else if (Property->isSubClassOf("NoAlias")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, NoAlias));		ArgumentAttributes.push_back(std::make_pair(ArgNo, NoAlias));
} else if (Property->isSubClassOf("Returned")) {		} else if (Property->isSubClassOf("Returned")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, Returned));		ArgumentAttributes.push_back(std::make_pair(ArgNo, Returned));
		} else if (Property->isSubClassOf("VectorLength")) {
		unsigned ArgNo = Property->getValueAsInt("ArgNo");
		ArgumentAttributes.push_back(std::make_pair(ArgNo, VectorLength));
		} else if (Property->isSubClassOf("Mask")) {
		unsigned ArgNo = Property->getValueAsInt("ArgNo");
		ArgumentAttributes.push_back(std::make_pair(ArgNo, Mask));
		} else if (Property->isSubClassOf("Passthru")) {
		unsigned ArgNo = Property->getValueAsInt("ArgNo");
		ArgumentAttributes.push_back(std::make_pair(ArgNo, Passthru));
} else if (Property->isSubClassOf("ReadOnly")) {		} else if (Property->isSubClassOf("ReadOnly")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, ReadOnly));		ArgumentAttributes.push_back(std::make_pair(ArgNo, ReadOnly));
} else if (Property->isSubClassOf("WriteOnly")) {		} else if (Property->isSubClassOf("WriteOnly")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, WriteOnly));		ArgumentAttributes.push_back(std::make_pair(ArgNo, WriteOnly));
} else if (Property->isSubClassOf("ReadNone")) {		} else if (Property->isSubClassOf("ReadNone")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
[27 lines not shown]

llvm/utils/TableGen/IntrinsicEmitter.cpp

[573 lines not shown]	if (L->canThrow != R->canThrow)
return R->canThrow;		return R->canThrow;

if (L->isNoDuplicate != R->isNoDuplicate)		if (L->isNoDuplicate != R->isNoDuplicate)
return R->isNoDuplicate;		return R->isNoDuplicate;

if (L->isNoReturn != R->isNoReturn)		if (L->isNoReturn != R->isNoReturn)
return R->isNoReturn;		return R->isNoReturn;

		if (L->isNoSync != R->isNoSync)
		return R->isNoSync;

if (L->isWillReturn != R->isWillReturn)		if (L->isWillReturn != R->isWillReturn)
return R->isWillReturn;		return R->isWillReturn;

if (L->isCold != R->isCold)		if (L->isCold != R->isCold)
return R->isCold;		return R->isCold;

if (L->isConvergent != R->isConvergent)		if (L->isConvergent != R->isConvergent)
return R->isConvergent;		return R->isConvergent;
[89 lines not shown]	if (ae) {
addComma = true;		addComma = true;
break;		break;
case CodeGenIntrinsic::Returned:		case CodeGenIntrinsic::Returned:
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::Returned";		OS << "Attribute::Returned";
addComma = true;		addComma = true;
break;		break;
		case CodeGenIntrinsic::VectorLength:
		if (addComma)
		OS << ",";
		OS << "Attribute::VectorLength";
		addComma = true;
		break;
		case CodeGenIntrinsic::Mask:
		if (addComma)
		OS << ",";
		OS << "Attribute::Mask";
		addComma = true;
		break;
		case CodeGenIntrinsic::Passthru:
		if (addComma)
		OS << ",";
		OS << "Attribute::Passthru";
		addComma = true;
		break;
case CodeGenIntrinsic::ReadOnly:		case CodeGenIntrinsic::ReadOnly:
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::ReadOnly";		OS << "Attribute::ReadOnly";
addComma = true;		addComma = true;
break;		break;
case CodeGenIntrinsic::WriteOnly:		case CodeGenIntrinsic::WriteOnly:
if (addComma)		if (addComma)
[19 lines not shown]	if (ae) {
} while (ai != ae && intrinsic.ArgumentAttributes[ai].first == argNo);		} while (ai != ae && intrinsic.ArgumentAttributes[ai].first == argNo);
OS << "};\n";		OS << "};\n";
OS << " AS[" << numAttrs++ << "] = AttributeList::get(C, "		OS << " AS[" << numAttrs++ << "] = AttributeList::get(C, "
<< attrIdx << ", AttrParam" << attrIdx << ");\n";		<< attrIdx << ", AttrParam" << attrIdx << ");\n";
}		}
}		}

if (!intrinsic.canThrow ||		if (!intrinsic.canThrow ||
(intrinsic.ModRef != CodeGenIntrinsic::ReadWriteMem && !intrinsic.hasSideEffects) ||		(intrinsic.ModRef != CodeGenIntrinsic::ReadWriteMem && !intrinsic.hasSideEffects) ||
intrinsic.isNoReturn || intrinsic.isWillReturn || intrinsic.isCold ||		intrinsic.isNoReturn || intrinsic.isNoSync || intrinsic.isWillReturn ||
intrinsic.isNoDuplicate || intrinsic.isConvergent ||		intrinsic.isCold || intrinsic.isNoDuplicate || intrinsic.isConvergent ||
intrinsic.isSpeculatable) {		intrinsic.isSpeculatable) {
OS << " const Attribute::AttrKind Atts[] = {";		OS << " const Attribute::AttrKind Atts[] = {";
bool addComma = false;		bool addComma = false;
if (!intrinsic.canThrow) {		if (!intrinsic.canThrow) {
OS << "Attribute::NoUnwind";		OS << "Attribute::NoUnwind";
addComma = true;		addComma = true;
}		}
if (intrinsic.isNoReturn) {		if (intrinsic.isNoReturn) {
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::NoReturn";		OS << "Attribute::NoReturn";
addComma = true;		addComma = true;
}		}
		if (intrinsic.isNoSync) {
		if (addComma)
		OS << ",";
		OS << "Attribute::NoSync";
		addComma = true;
		}
if (intrinsic.isWillReturn) {		if (intrinsic.isWillReturn) {
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::WillReturn";		OS << "Attribute::WillReturn";
addComma = true;		addComma = true;
}		}
if (intrinsic.isCold) {		if (intrinsic.isCold) {
if (addComma)		if (addComma)
[210 lines not shown]
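
The emitter code above repeats one idiom for every attribute, including the new `NoSync`, `VectorLength`, `Mask` and `Passthru` cases: write a comma only if something was already written, then write the attribute name and set `addComma`. A minimal standalone sketch of that idiom follows; the function `emitAttrList` and its string-based interface are hypothetical and serve only to show the pattern, they are not part of IntrinsicEmitter.cpp.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins attribute names the way the emitter does,
// prefixing each with "Attribute::" and separating entries with ",".
std::string emitAttrList(const std::vector<std::string> &Names) {
  std::ostringstream OS;
  bool addComma = false; // becomes true once the first entry is written
  for (const std::string &Name : Names) {
    if (addComma)
      OS << ",";
    OS << "Attribute::" << Name;
    addComma = true;
  }
  return OS.str();
}
```

Because the separator is emitted before each entry rather than after it, the generated initializer list never ends in a trailing comma, regardless of which flags are set.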

Revision Contents

Diff 246280
