This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/lib/Target/PowerPC/
    PPCISelDAGToDAG.cpp
    PPCISelLowering.h
    PPCISelLowering.cpp
    PPCInstr64Bit.td
    PPCInstrAltivec.td
    PPCInstrInfo.td
    PPCInstrPrefix.td
    PPCInstrVSX.td
llvm/test/CodeGen/PowerPC/
    p9-dform-load-alignment.ll

Differential D93370

[PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns.
ClosedPublic

Authored by amyk on Dec 15 2020, 10:37 PM.

Details

Reviewers

power-llvm-team
nemanjai
bsaleil

Group Reviewers

Restricted Project

Commits

rG64d951be61aa: [PowerPC] Add new infrastructure to select load/store instructions, update…

Summary

This patch introduces a new infrastructure that is used to select the load and store instructions in the PPC backend.

The primary motivation is that the current implementation of load/store selection depends on the ordering of patterns in TableGen.
Given this limitation, we cannot easily and reliably generate the P10 prefixed load and store instructions (such as when
the immediate fits within 34 bits). This refactoring is meant to give us more control over the patterns and the different forms to exploit,
and to eliminate the dependency on pattern declaration order in TableGen.

The idea of this refactoring is that it introduces a set of addressing modes that correspond to different instruction formats
of a particular load and store instruction, along with a set of common flags that describes a load/store. Whenever a load/store
instruction is being selected, we analyze the instruction and compute a set of flags for it. The computed flags are then used to
select the most optimal load/store addressing mode.

The flags are computed in computeMOFlags(), and the optimal addressing mode is selected by
getAddrModeForFlags(), which searches a map that relates sets of common flags to
addressing modes. Once the optimal addressing mode is determined, this information is given to SelectOptimalAddrMode(),
where we set the base and displacement of the load/store according to the addressing mode.
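
As a rough illustration of the flag lookup described above, here is a minimal, self-contained sketch. It is not the code from the patch: the flag values are a trimmed-down subset, the table contents are illustrative, and `ModeEntry`, `addrModesTable()`, `getAddrModeForFlagsSketch()` and `checkAddrModeSketch()` are hypothetical names. It assumes that a mode matches when one of its flag sets is fully contained in the computed flags, and that the X-Form is the fallback.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, trimmed-down flag values; the names echo the patch but this
// is not the real PPC::MemOpFlags definition.
constexpr uint32_t MOF_None = 0;
constexpr uint32_t MOF_ZExt = 1u << 1;
constexpr uint32_t MOF_RPlusSImm16 = 1u << 6;
constexpr uint32_t MOF_RPlusSImm16Mult4 = 1u << 8;
constexpr uint32_t MOF_RPlusR = 1u << 11;
constexpr uint32_t MOF_WordInt = 1u << 16;
constexpr uint32_t MOF_DoubleWordInt = 1u << 17;

enum AddrMode { AM_None, AM_DForm, AM_DSForm, AM_XForm };

// One table row per addressing mode: the flag sets that select it.
struct ModeEntry {
  AddrMode Mode;
  std::vector<uint32_t> FlagSets;
};

// Illustrative table only (assumption); a real table would hold many more
// flag sets per mode.
static const std::vector<ModeEntry> &addrModesTable() {
  static const std::vector<ModeEntry> Table = {
      {AM_DForm, {MOF_ZExt | MOF_RPlusSImm16 | MOF_WordInt}},
      {AM_DSForm, {MOF_ZExt | MOF_RPlusSImm16Mult4 | MOF_DoubleWordInt}},
  };
  return Table;
}

// Returns the first mode that has a flag set fully contained in Flags,
// falling back to the X-Form as the most general addressing mode.
AddrMode getAddrModeForFlagsSketch(uint32_t Flags) {
  if (Flags == MOF_None)
    return AM_None; // Assumption: nothing to select for an empty flag set.
  for (const ModeEntry &Entry : addrModesTable())
    for (uint32_t FlagSet : Entry.FlagSets)
      if ((Flags & FlagSet) == FlagSet)
        return Entry.Mode;
  return AM_XForm;
}

// Usage example: call from main() to exercise the sketch.
void checkAddrModeSketch() {
  assert(getAddrModeForFlagsSketch(MOF_ZExt | MOF_RPlusSImm16 | MOF_WordInt) ==
         AM_DForm);
  assert(getAddrModeForFlagsSketch(MOF_ZExt | MOF_RPlusR | MOF_WordInt) ==
         AM_XForm);
}
```

The point of the table-driven shape is that supporting a new addressing form means adding a row rather than reordering patterns.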

Note also that SelectForceXForm() is similar to SelectAddressRegRegOnly(), with an updated name
and one condition removed to better suit the load/store refactoring.

This patch is the first of a series of patches to be committed - it contains the initial implementation of the refactored load/store
selection infrastructure and also updates P8/P9 patterns to adopt this infrastructure. The idea is that incremental patches will
add more implementation and support, and eventually the old implementation will be removed.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

amyk created this revision. Dec 15 2020, 10:37 PM

Herald added subscribers: shchenz, jfb, hiraditya. Dec 15 2020, 10:37 PM

amyk requested review of this revision. Dec 15 2020, 10:37 PM

Harbormaster completed remote builds in B82581: Diff 312110. Dec 15 2020, 10:38 PM

So, is this patch still incomplete? I didn't see any test changes.

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
1070	I'm not sure this is handled correctly, since you are removing the alignment restriction for LWA. Technically speaking, we need to remove all such alignment restrictions in the load/store patterns, as long as the analysis is done in the source code. But I notice that we are removing the alignment restriction here while still keeping it for LD.

lkail added a subscriber: lkail. Dec 16 2020, 4:55 AM

lkail added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
2543	Should it be a `static` function?

@steven.zhang The idea is that this patch should be NFC. All existing load/store test cases should pass with this refactoring. I do think there should be more tests added, perhaps in a follow up patch. What do you think?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
2543	Good point. I will fix this.
llvm/lib/Target/PowerPC/PPCISelLowering.h
721	Add comments to describe the functions.
llvm/lib/Target/PowerPC/PPCInstr64Bit.td
1070	You're right; I meant to remove the `align`/`unalign` in the other `LD`,`LWA` patterns, as well, as the new load/store infrastructure is meant to use load/sextload/zextload and compute alignment based on the flags.

I think it is better to explicitly give some reasons why we need this big refactoring, in other words, what's the disadvantage/limitation of legacy implementation? Thank you for the big effort.

llvm/lib/Target/PowerPC/PPCISelLowering.h
702	Aren't the following features independent of any one specific memory operation? Should we add them into the flag set of each memory operation in `computeMOFlags`? Is it possible to get the subtarget info when we do the addressing mode selection, after we have all the flags?

Add more comments to functions, make provablyDisjointOr() static, fix pattern in PPCInstr64Bit.td.

All existing tests in the backend run successfully with this patch.

In D93370#2459137, @shchenz wrote:

I think it is better to explicitly give some reasons why we need this big refactoring, in other words, what's the disadvantage/limitation of legacy implementation? Thank you for the big effort.

That is a good point. Thank you for bringing this up. The primary reason behind this refactoring is that it would allow us to easily exploit prefixed loads and stores when the offset is large (such as for immediates that fit within 34 bits).

The current implementation of selecting load/stores is very dependent on the ordering of patterns within TableGen, and we aren't able to easily and reliably generate the prefixed load/stores (especially since those patterns would be written in a different file).
Essentially, the refactoring of how the loads/stores are handled is meant to provide us with more control over the patterns/different forms to exploit. Additionally, we want to eliminate the dependency on pattern declaration ordering as much as possible, so that regardless of the order in which the compiler tries to match the patterns, we can generate the best possible code.

amyk edited the summary of this revision. Dec 16 2020, 5:17 PM

amyk added inline comments. Dec 16 2020, 5:26 PM

llvm/lib/Target/PowerPC/PPCISelLowering.h
702	I apologize, I do not quite follow. If you could clarify, that would be great. Are you suggesting that these should not be flags, but rather Subtarget checks made after the flags are computed? I think storing the subtarget information in the set of flags is useful because we can easily know which instructions we can produce on P9 and P10. For instance, if we have flags set for `SubtargetP9`, `ScalFlt` and `RPlusSImm16Mult4` (register + signed 16-bit immediate, multiple of 4), then we know we can generate the DS-Form (corresponding to the `DFLOAD` pseudoinstruction). Or, when we have `PPC::MOF_SubtargetP10` and `PPC::MOF_RPlusSImm34` (signed 34-bit immediate), we know we can generate prefixed load/stores.

Harbormaster completed remote builds in B82704: Diff 312325. Dec 16 2020, 5:51 PM

The primary motivation is that the current implementation of selecting load/stores is dependent on the ordering of patterns in TableGen.

I think this can also be resolved by AddedComplexity in the td files? Would the new infra make it convenient to solve the issue in D91279, where the worst case happens in instruction selection for a load/store? With the legacy infra, we have to select an xform with one zero register operand for the worst case. But on P10 we seem to prefer a dform with a 0 offset, which is good for some linker optimizations. That issue does not have a simple fix with the legacy infra because the worst case is handled in the xform-only selection function: we must select an xform, so we must use the zero register. Could it be solved more easily under the new infra? Thanks.

llvm/lib/Target/PowerPC/PPCISelLowering.h
702	"Are you suggesting that these should not be flags, but more of Subtarget checks after the flags are computed?" Yes, that was my first thought. For normal load/store instructions before ISEL, we can get the type and address info (zext/sext/imm) from the instruction itself. But the subtarget is not a characteristic of the load/store. We can check the subtarget info during addressing mode selection, once we have the other flags. But I am OK with making the subtarget a flag for load/store for implementation convenience.

In D93370#2458401, @amyk wrote:

@steven.zhang The idea is that this patch should be NFC. All existing load/store test cases should pass with this refactoring. I do think there should be more tests added, perhaps in a follow up patch. What do you think?

Yeah, you'd better tag the revision as NFC if it is.

Just my general $0.02 regarding this refactoring effort...

The existing infrastructure was fine for quite a while. We had D-Form, DS-Form and X-Form which made things reasonably simple. However, we started adding new addressing forms - DQ-Form, MLS:D-Form, 8LS:D-Form, etc. which really started adding complexity to the existing infrastructure that was difficult to understand. Ultimately, the patterns were asking the wrong questions that required an increasing amount of context to answer. Questions like "is this a reg+reg operation with an immediate displacement that is a multiple of 4/16" (SelectAddrIdxX4/SelectAddrIdxX16). A question like that doesn't even fundamentally make sense. If the address is represented as an addition of two registers, the question of whether one of them is a multiple of anything is meaningless. Of course, we needed that in order to avoid selecting an X-Form when a D[SQ]-Form is available, but that is still a weird way to structure a query.
Ultimately, selection of memory access instructions wants to know one thing - "What is the optimal addressing form for this access?" So when selecting an instruction, the question is "is my addressing form the optimal one?"
That is what this refactoring aims to accomplish. So in the future, if we get new D<whatever>-Form instructions that have different requirements for alignment/displacement size/displacement alignment/etc. we should be able to easily extend this.

Furthermore, this reduces/avoids the need for hacks such as AddedComplexity and CodeSize.
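
To make the "displacement size/displacement alignment" requirements above concrete, here is a small sketch. The helper names `isSInt16()` and `fitsDisplacementForm()` are hypothetical and not from the patch. It only assumes what the doc comments in this revision state: a D-Form takes a signed 16-bit displacement, a DS-Form additionally needs a multiple of 4, and a DQ-Form a multiple of 16.

```cpp
#include <cassert>
#include <cstdint>

// True if V fits in a signed 16-bit field.
bool isSInt16(int64_t V) { return V >= INT16_MIN && V <= INT16_MAX; }

// RequiredMultiple is 1 for D-Form, 4 for DS-Form and 16 for DQ-Form.
// Hypothetical helper for illustration only.
bool fitsDisplacementForm(int64_t Disp, int64_t RequiredMultiple) {
  assert(RequiredMultiple > 0 && "multiple must be positive");
  return isSInt16(Disp) && Disp % RequiredMultiple == 0;
}
```

With this, a displacement of 6 is acceptable for a D-Form (`fitsDisplacementForm(6, 1)`) but not for a DS-Form (`fitsDisplacementForm(6, 4)`), which is the kind of case where an X-Form would be needed instead.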

amyk retitled this revision from "[PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns." to "[PowerPC][NFC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns." Dec 17 2020, 7:04 AM

I just had some minor comments. I think it makes sense overall.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
2529	I find it odd that you are zero extending this (i.e. unsigned) and then casting it to a signed type. You can get into all kinds of issues with this kind of thing. For example:
    int16_t a = -1;            // This is 0xFFFF
    int32_t b = zeroExtend(a); // This is 0x0000FFFF (Not -1 but 65535)
It is important to think about what the possible value types are for `N`. If the only possible types are `MVT::i32` and `MVT::i64` then we are fine.
2533	You can probably simplify this a little bit. See what others think too...
    Imm = (int32_t)cast<ConstantSDNode>(N)->getZExtValue();
    int64_t Imm64 = (int64_t)cast<ConstantSDNode>(N)->getZExtValue();
    return isInt<32>(Imm64);
17034	For this case it may be easier to just work with the APInt instead of creating your own functions. So, you may not need `isIntS32Immediate` and `isIntS34Immediate`. Especially since you don't use Imm34.
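
The zero-extension concern raised for line 2529 can be reproduced in isolation. The sketch below uses hypothetical helpers `zeroExtend16()` and `signExtend16()` (stand-ins, not LLVM APIs) to show how the same 16-bit pattern widens to two different 32-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Reinterprets the 16 bits as unsigned before widening (zero extension).
int32_t zeroExtend16(int16_t V) {
  return static_cast<int32_t>(static_cast<uint16_t>(V));
}

// Widens while preserving the sign (sign extension).
int32_t signExtend16(int16_t V) { return static_cast<int32_t>(V); }

// Usage example: call from main() to exercise the sketch.
void checkExtension() {
  int16_t A = -1; // Bit pattern 0xFFFF.
  assert(zeroExtend16(A) == 65535); // 0x0000FFFF, no longer negative.
  assert(signExtend16(A) == -1);    // 0xFFFFFFFF, still -1.
}
```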

I should have said this in my first comments. I like this refactor. I just want to make sure I understand this refactoring more clear^_^. Thanks for your detailed explanation. @nemanjai

For the legacy infra, one pain point for me is matching efficiency. When we match the worst case in the td files, we will
1: first check the dform candidate, and in the dform candidate handling function SelectAddressRegImm, we check the xform candidate. If it fails:
2: check the xform candidate, and in the xform candidate handling function SelectAddressRegReg, we check the dform candidate. If it fails:
3: check the xform-only candidate, and in the xform-only candidate handling function SelectAddressRegRegOnly, we get the desired hardware instructions.

I think there must be redundant logic here. We check the load/store addressing mode again and again.

So for the new infra, should we avoid this redundant logic? I still see redundant checks/flag collection for one load/store instruction. We first check SelectDSForm, then SelectXForm, and then SelectForceXForm, according to the order in the td files. In each SelectXXXForm(), we collect the flags once. Isn't that also redundant?

Can we select the addressing mode without starting from the td files? Instead, could we start the selection from the cpp file for an IR-level load? For example, in PPCDAGToDAGISel::Select(), change case ISD::LOAD so that inside the case we call SelectOptimalAddrMode and then select the PPC instruction directly in the cpp files.

In D93370#2461971, @shchenz wrote:

So for the new infra, should we avoid this redundant logic? I still see redundant checks/flag collection for one load/store instruction. We first check SelectDSForm, then SelectXForm, and then SelectForceXForm, according to the order in the td files. In each SelectXXXForm(), we collect the flags once. Isn't that also redundant?

Can we select the addressing mode without starting from the td files? Instead, could we start the selection from the cpp file for an IR-level load? For example, in PPCDAGToDAGISel::Select(), change case ISD::LOAD so that inside the case we call SelectOptimalAddrMode and then select the PPC instruction directly in the cpp files.

I think this may be an interesting follow-up to optimize this slightly. However, I would imagine we would do this in PreprocessISelDAG(). Namely, we can replace ISD::LOAD with something like PPCISD::LOAD_W_ADDRMODE (and similarly for the stores). That way we can still specify more complex patterns in .td files rather than reimplementing everything in C++ code. Namely, we can still add things like:

def : Pat<(f128 (sint_to_fp (i64 (load xaddrX4:$src)))),
          (f128 (XSCVSDQP (LXSDX xaddrX4:$src)))>;

except it would be something like:

def : Pat<(f128 (sint_to_fp (i64 (PPCloadWAddForm DSForm:$src)))),
          (f128 (XSCVSDQP (LXSD DSForm:$src)))>;

However, considering the number of loads/stores in a typical DAG and how quickly these flags, etc. can be computed, this isn't a huge compile time improvement.

Updated names of the selection functions in the td patterns
Address the comment of using APInt when computing address flags
Removed NFC from the title, as there is one test case update where we now expect a DSForm instruction (instead of an XForm instruction)

Harbormaster completed remote builds in B84329: Diff 315122. Jan 7 2021, 6:27 AM

Removed the addition of isIntS32Immediate() - it is no longer needed after addressing the comment of using APInt.

amyk added a child revision: D94498: [PowerPC][NFC] Update atomic patterns to use the refactored load/store implementation. Jan 12 2021, 7:58 AM

Thanks a lot for working on that Amy ! I have some comments on the patch.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
16959	`The address flags are are stored` -> `The address flags are stored`
16966–16967	I think you can directly use the `Subtarget` field of the `PPCTargetLowering` class instead of using a `static_cast` here.
16980	This line should be moved below just before the assert.
17027	`consstants` -> `constants`
17056–17057	Should we name this flag `MOF_NotAddNorCst` then ?
17105–17107	I know this is code we already have in `SelectAddressRegRegOnly`, but isn't that condition weird ? Or at least it doesn't match the comment above. If I understand correctly, this is the case where we get rid of the `add`, but looking at the condition, we get rid of it only if it is not an add of a value and a 16-bit signed constant or if one of the operands doesn't have a single use. Shouldn't the condition be `(N.getOpcode() == ISD::ADD && !isIntS16Immediate(N.getOperand(1), ForceXFormImm) && N.getOperand(1).hasOneUse() && N.getOperand(0).hasOneUse())` instead ?
llvm/lib/Target/PowerPC/PPCISelLowering.h
695–700	The flag names are not really uniform, I guess DWInt is for DoubleWordInt? I think we should drop the abbreviations here so it is clearer what the flags represent. Something like:
    MOF_SubWordInt = 1 << 15,
    MOF_WordInt = 1 << 16,
    MOF_DoubleWordInt = 1 << 17,
    MOF_ScalarFloat = 1 << 18,
    MOF_Vector = 1 << 19,
    MOF_Vector256 = 1 << 20,
703	Maybe we should name this flag `MOF_SubtargetBeforeP9` or something like that, to make it clear that the flag is not set on P10.
722–723	I think the map and the function declaration should be moved with the others private fields below.
1089	This method should be private.
1094	This method should be private.

Addressed review comments:

Rename some of the MemOpFlags
Fixed typo in comments
Moved variables near their use
Made some of the methods private

amyk added a child revision: D95115: [PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and Tests. Jan 20 2021, 10:56 PM

amyk mentioned this in D95116: [PowerPC] Update PC-Relative Load/Store Patterns to use the refactored Load/Store Implementation. Jan 20 2021, 11:01 PM

amyk marked 10 inline comments as done. Jan 21 2021, 3:12 PM

amyk added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
17105–17107	That's a good point, Baptiste. You're right that this is the case where we eliminate the `add`. I thought about it a bit, and I believe this condition might match the comment more closely (only get rid of the `add` if we do not have an `add` of a value and a signed 16-bit immediate, and the two operands don't have a single use):
    if (N.getOpcode() == ISD::ADD &&
        (!(isIntS16Immediate(N.getOperand(1), imm) &&
           N.getOperand(1).hasOneUse() && N.getOperand(0).hasOneUse())))
which, I believe, should then be equivalent to the condition in the code. But yes, the condition was previously in `SelectAddressRegRegOnly()`, which is why I kept it. If there are any more concerns about the condition and/or comment, I can adjust it.
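
The equivalence claimed in this reply can be checked exhaustively, since only four boolean facts are involved. The sketch below is hedged: the condition in the code is not shown in this excerpt, so `assumedCodeCond()` assumes it has the De Morgan-expanded shape `IsAdd && (!IsImm16 || !Op1OneUse || !Op0OneUse)`. All function names are hypothetical.

```cpp
#include <cassert>

using Cond = bool (*)(bool IsAdd, bool IsImm16, bool Op1OneUse, bool Op0OneUse);

// Assumed shape of the condition in the code (not shown in this excerpt).
bool assumedCodeCond(bool IsAdd, bool IsImm16, bool Op1OneUse, bool Op0OneUse) {
  return IsAdd && (!IsImm16 || !Op1OneUse || !Op0OneUse);
}

// The author's rewrite from the reply above.
bool authorCond(bool IsAdd, bool IsImm16, bool Op1OneUse, bool Op0OneUse) {
  return IsAdd && !(IsImm16 && Op1OneUse && Op0OneUse);
}

// The condition proposed in the earlier review comment.
bool reviewerCond(bool IsAdd, bool IsImm16, bool Op1OneUse, bool Op0OneUse) {
  return IsAdd && !IsImm16 && Op1OneUse && Op0OneUse;
}

// Counts the input combinations (out of 16) on which two conditions differ.
int countDisagreements(Cond A, Cond B) {
  int Count = 0;
  for (int Bits = 0; Bits < 16; ++Bits) {
    bool IsAdd = Bits & 1, IsImm16 = Bits & 2;
    bool Op1OneUse = Bits & 4, Op0OneUse = Bits & 8;
    if (A(IsAdd, IsImm16, Op1OneUse, Op0OneUse) !=
        B(IsAdd, IsImm16, Op1OneUse, Op0OneUse))
      ++Count;
  }
  return Count;
}
```

Under that assumption, `countDisagreements(assumedCodeCond, authorCond)` is zero (plain De Morgan), while the reviewer's stricter form differs from the author's on some inputs, so the two proposals are not interchangeable.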

Thanks for addressing the comments, LGTM now.

This revision is now accepted and ready to land. Jan 22 2021, 3:10 PM

Update patch to rebase with latest trunk, and rearrange/clean up code slightly.

Harbormaster completed remote builds in B87518: Diff 320765. Feb 2 2021, 6:42 AM

amyk mentioned this in D96075: [PowerPC] Exploit Prefixed Load/Stores using the refactored Load/Store Implementation. Feb 4 2021, 1:20 PM

Update patch to move the check for whether we have a value with an offset that fits into a 34-bit immediate into its own condition instead of inside an else if.

Harbormaster completed remote builds in B88792: Diff 322972. Feb 11 2021, 6:31 AM

Update patch to fix a small issue when setting the Base and Disp for DForms when we have constant that fits in 32-bits.
Previously I used a uint64_t when it should have been a uint16_t.

Harbormaster completed remote builds in B89297: Diff 323854. Feb 15 2021, 6:35 PM

amyk mentioned this in D97391: [NFC][PowerPC] Add additional load/store test cases. Feb 24 2021, 7:58 AM

Rebase this patch to the latest changes.

amyk mentioned this in D95115: [PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and Tests. Mar 15 2021, 6:43 AM

amyk mentioned this in rGe582c073d19b: [NFC][PowerPC] Add additional load/store test cases. Mar 15 2021, 6:55 AM

Harbormaster completed remote builds in B93800: Diff 330636. Mar 15 2021, 7:14 AM

Thank you for handling this much needed refactoring. My comments are mostly related to readability so this is very close to approval, but let's have another look when you address the comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1432	s/instruction formats/addressing modes
1437	I think you should add some comments regarding sample instructions that correspond to these sets of flags. For example:
    PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_WordInt, // LWZ
    PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt, // LBZ, LHZ
    PPC::MOF_SExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt, // LHA
    ...
2521	I think we no longer need to put the name of the function in Doxygen comments.
2523–2524	Replace:
    This is for when we have an OR of disjoint bitfields, we can codegen it as an add (for better address arithmetic).
with
    An OR of two provably disjoint values is equivalent to an ADD. Most PPC load/store instructions compute the effective address as a sum, so doing this conversion is useful.
16952	s/return an X-Form instructions can always be matched/return an X-Form as it is the most general addressing mode
16959–16960	Remove this part of the comment. That is not done here AFAICT.
16970	I think this should be checking for prefixed instructions. It is entirely possible to have `-mcpu=pwr9 -Xclang -target-feature -Xclang -prefix-instrs` (or `llc -mattr=-prefix-instrs`).
16983	This is a large lambda that is only called once. AFAICT, it only captures `FlagSet`. I think it is probably better off as a `static` function and `FlagSet` can be passed by reference. Removing it will also remove `SetAlignFlagsForImm` from this function and the whole thing should become significantly more readable. Also as a minor nit, you are less likely to have these mismatches in naming/capitalization such as between `SetAlignFlagsForImm` and `computeFlagsForAddressComputation`. Both are lambdas, one is capitalized as a variable and the other as a function - understandable omission since a lambda is both.
17044	s/Vectors only/Integer vectors.
17055	Seems like we should have another unreachable here in case we end up with an illegal floating point type such as half precision or some weird FP type (such as Intel's 80-bit FP).
17078	Add this to the comment:
    // We set the extension mode to zero extension so we don't have
    // to add separate entries in AddrModesMap for stores and loads.
17084–17085
    // If we don't have prefixed instructions, 34-bit constants should be
    // treated as PPC::MOF_NotAddNorCst so they can match D-Forms.
17086–17089
    bool IsNonP1034BitConst =
        ((PPC::MOF_RPlusSImm34 | PPC::MOF_AddrIsSImm32 | PPC::MOF_SubtargetP10) &
         FlagSet) == PPC::MOF_RPlusSImm34;
    if (N.getOpcode() != ISD::ADD && N.getOpcode() != ISD::OR &&
        IsNonP1034BitConst)
      FlagSet |= PPC::MOF_NotAddNorCst;
17156	This condition seems superfluous. Why do we check that the operand is a signed 16-bit constant (and then create another constant with the same value) when we have `PPC::MOF_RPlusSImm16` set? Doesn't the flag already assure us that this is so? Can we not just assert that this is so?
17191	Why can't this be `(CNType == MVT::i32 || isInt<32>(CNImm))`?
17193	For purposes where the width of the value matters, please refrain from using types with an implicit size and use explicitly sized types (i.e. `int32_t` vs. `int`, `int16_t` vs. `short`).
llvm/lib/Target/PowerPC/PPCISelLowering.h
678	`// Extension mode for integer loads.`
684–692
    MOF_NotAddNorCst = 1 << 5,      // Not const. or sum of ptr and scalar.
    MOF_RPlusSImm16 = 1 << 6,       // Reg plus signed 16-bit constant.
    MOF_RPlusLo = 1 << 7,           // Reg plus signed 16-bit relocation
    MOF_RPlusSImm16Mult4 = 1 << 8,  // Reg plus 16-bit signed multiple of 4.
    MOF_RPlusSImm16Mult16 = 1 << 9, // Reg plus 16-bit signed multiple of 16.
    MOF_RPlusSImm34 = 1 << 10,      // Reg plus 34-bit signed constant.
    MOF_RPlusR = 1 << 11,           // Sum of two variables.
    MOF_PCRel = 1 << 12,            // PC-Relative relocation.
    MOF_AddrIsSImm32 = 1 << 13,     // A simple 32-bit constant.
698–699
    MOF_ScalarFloat = 1 << 18, // Scalar single or double precision.
    MOF_Vector = 1 << 19,      // Vector types and quad precision scalars.
1383	s/are are/are
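
The reworded comment suggested for lines 2523–2524 rests on a simple arithmetic fact: when two values share no set bits, adding them produces no carries, so OR and ADD give the same result. A minimal sketch with concrete values follows. `haveDisjointBits()` and `checkOrIsAdd()` are hypothetical names, and the patch itself has to prove disjointness for values that are not known constants, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstdint>

// True when no bit position is set in both values.
bool haveDisjointBits(uint64_t A, uint64_t B) { return (A & B) == 0; }

// Usage example: call from main() to exercise the sketch.
void checkOrIsAdd() {
  uint64_t Base = 0x1000, Offset = 0x00F0; // Disjoint bit ranges.
  assert(haveDisjointBits(Base, Offset));
  assert((Base | Offset) == Base + Offset);

  uint64_t X = 0x18, Y = 0x08; // Bit 3 overlaps, so the add carries.
  assert(!haveDisjointBits(X, Y));
  assert((X | Y) != X + Y);
}
```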

This revision now requires changes to proceed. Mar 20 2021, 3:42 PM

I understand this refactoring was done to simplify adding instructions both now and in the future. Would it be possible to add comments in the code that tell the next developer who attempts to add instructions how to do it?

amyk marked 24 inline comments as done. Mar 23 2021, 7:51 AM

amyk added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
16970	I actually added a flag for prefixed instructions in https://reviews.llvm.org/D96075. Is this approach still acceptable, or would you prefer it to be done here instead in the approach you mentioned?

Create a static function to compute flags for address computation
Create static function to set alignment flags for FrameIndex
Update comments

Harbormaster completed remote builds in B95686: Diff 333291. Mar 25 2021, 8:02 AM

Rebase patch
Add documentation regarding refactored load and store implementation
Add the PPC::MOF_SubtargetP10 flag if the subtarget has prefixed instructions

Harbormaster completed remote builds in B99410: Diff 338417. Apr 18 2021, 8:21 PM

The rest of the comments I have are minor comment and code restructuring nits. Those can be addressed on the commit. Otherwise, LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
16963	Please reduce nesting by flipping this and early-exiting:
    FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(IsAdd ? N.getOperand(0) : N);
    if (!FI)
      return;
    // The rest of the code is not nested in this if
16966	It might be clearer to call this something like `FrameIndexAlign` since it seems to refer to alignment rather than a value.
17194
    // This is a register plus a 16-bit immediate. The base will be the
    // register and the displacement will be the immediate unless it
    // isn't sufficiently aligned.
17201–17206
    Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());
    Base = Op0;
    if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) {
      Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());
      fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());
    }
17210	This is not necessarily a load I think.
    // This is a register plus the @lo relocation. The base is the register
    // and the displacement is the global address.
17220
    // This is a constant address at most 32 bits. The base will be
    // zero or load-immediate-shifted and the displacement will be
    // the low 16 bits of the address.

This revision is now accepted and ready to land. Apr 23 2021, 2:29 PM

This revision was landed with ongoing or failed builds. Apr 30 2021, 7:53 AM

Closed by commit rG64d951be61aa: [PowerPC] Add new infrastructure to select load/store instructions, update… (authored by amyk).

This revision was automatically updated to reflect the committed changes.

amyk added a commit: rG64d951be61aa: [PowerPC] Add new infrastructure to select load/store instructions, update….

amyk mentioned this in rG1998a086551c: [PowerPC][NFC] Update atomic patterns to use the refactored load/store…. May 4 2021, 8:47 AM

amyk mentioned this in rGba627a32e125: [PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and…. Jul 16 2021, 7:29 AM

amyk mentioned this in rG351a0d8a9053: [PowerPC] Update PC-Relative Load/Store Patterns to use the refactored…. Sep 9 2021, 1:40 PM

amyk mentioned this in rG5041a485b948: [PowerPC] Exploit Prefixed Load/Stores using the refactored Load/Store…. Sep 14 2021, 6:40 AM

Revision Contents

Path                                                    Size
llvm/lib/Target/PowerPC/ (file names not shown)         39, 70, 436, 93, 30, 91, 48, 942 lines
llvm/test/CodeGen/PowerPC/p9-dform-load-alignment.ll    2 lines

Diff 341906

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp

@@ bool SelectAddrImmOffs(SDValue N, SDValue &Out) const {
           N.getOpcode() == ISD::TargetGlobalAddress) {
         Out = N;
         return true;
       }

       return false;
     }

+    /// SelectDSForm - Returns true if address N can be represented by the
+    /// addressing mode of DSForm instructions (a base register, plus a signed
+    /// 16-bit displacement that is a multiple of 4.
+    bool SelectDSForm(SDNode *Parent, SDValue N, SDValue &Disp, SDValue &Base) {
+      return PPCLowering->SelectOptimalAddrMode(Parent, N, Disp, Base, *CurDAG,
+                                                Align(4)) == PPC::AM_DSForm;
+    }
+
+    /// SelectDQForm - Returns true if address N can be represented by the
+    /// addressing mode of DQForm instructions (a base register, plus a signed
+    /// 16-bit displacement that is a multiple of 16.
+    bool SelectDQForm(SDNode *Parent, SDValue N, SDValue &Disp, SDValue &Base) {
+      return PPCLowering->SelectOptimalAddrMode(Parent, N, Disp, Base, *CurDAG,
+                                                Align(16)) == PPC::AM_DQForm;
+    }
+
+    /// SelectDForm - Returns true if address N can be represented by
+    /// the addressing mode of DForm instructions (a base register, plus a
+    /// signed 16-bit immediate.
+    bool SelectDForm(SDNode *Parent, SDValue N, SDValue &Disp, SDValue &Base) {
+      return PPCLowering->SelectOptimalAddrMode(Parent, N, Disp, Base, *CurDAG,
+                                                None) == PPC::AM_DForm;
+    }
+
+    /// SelectXForm - Returns true if address N can be represented by the
+    /// addressing mode of XForm instructions (an indexed [r+r] operation).
+    bool SelectXForm(SDNode *Parent, SDValue N, SDValue &Disp, SDValue &Base) {
+      return PPCLowering->SelectOptimalAddrMode(Parent, N, Disp, Base, *CurDAG,
+                                                None) == PPC::AM_XForm;
+    }
+
+    /// SelectForceXForm - Given the specified address, force it to be
+    /// represented as an indexed [r+r] operation (an XForm instruction).
+    bool SelectForceXForm(SDNode *Parent, SDValue N, SDValue &Disp,
+                          SDValue &Base) {
+      return PPCLowering->SelectForceXFormMode(N, Disp, Base, *CurDAG) ==
+             PPC::AM_XForm;
+    }
+
     /// SelectAddrIdx - Given the specified address, check to see if it can be
     /// represented as an indexed [r+r] operation.
     /// This is for xform instructions whose associated displacement form is D.
     /// The last parameter \p 0 means associated D form has no requirment for 16
     /// bit signed displacement.
     /// Returns false if it can be represented by [r+imm], which are preferred.
     bool SelectAddrIdx(SDValue N, SDValue &Base, SDValue &Index) {
       return PPCLowering->SelectAddressRegReg(N, Base, Index, *CurDAG, None);

llvm/lib/Target/PowerPC/PPCISelLowering.h

Show First 20 Lines • Show All 665 Lines • ▼ Show 20 Lines	unsigned getSplatIdxForPPCMnemonics(SDNode *N, unsigned EltSize,
SelectionDAG &DAG);		SelectionDAG &DAG);

/// get_VSPLTI_elt - If this is a build_vector of constants which can be		/// get_VSPLTI_elt - If this is a build_vector of constants which can be
/// formed by using a vspltis[bhw] instruction of the specified element		/// formed by using a vspltis[bhw] instruction of the specified element
/// size, return the constant being splatted. The ByteSize field indicates		/// size, return the constant being splatted. The ByteSize field indicates
/// the number of bytes of each element [124] -> [bhw].		/// the number of bytes of each element [124] -> [bhw].
SDValue get_VSPLTI_elt(SDNode *N, unsigned ByteSize, SelectionDAG &DAG);		SDValue get_VSPLTI_elt(SDNode *N, unsigned ByteSize, SelectionDAG &DAG);

		// Flags for computing the optimal addressing mode for loads and stores.
		enum MemOpFlags {
		MOF_None = 0,

		// Extension mode for integer loads.
		nemanjaiUnsubmitted Done Reply Inline Actions `// Extension mode for integer loads.` nemanjai: `// Extension mode for integer loads.`
		MOF_SExt = 1,
		MOF_ZExt = 1 << 1,
		MOF_NoExt = 1 << 2,

		// Address computation flags.
		MOF_NotAddNorCst = 1 << 5, // Not const. or sum of ptr and scalar.
		MOF_RPlusSImm16 = 1 << 6, // Reg plus signed 16-bit constant.
		MOF_RPlusLo = 1 << 7, // Reg plus signed 16-bit relocation
		MOF_RPlusSImm16Mult4 = 1 << 8, // Reg plus 16-bit signed multiple of 4.
		MOF_RPlusSImm16Mult16 = 1 << 9, // Reg plus 16-bit signed multiple of 16.
		MOF_RPlusSImm34 = 1 << 10, // Reg plus 34-bit signed constant.
		MOF_RPlusR = 1 << 11, // Sum of two variables.
		MOF_PCRel = 1 << 12, // PC-Relative relocation.
		MOF_AddrIsSImm32 = 1 << 13, // A simple 32-bit constant.
		nemanjai (Done):
		    MOF_NotAddNorCst = 1 << 5, // Not const. or sum of ptr and scalar.
		    MOF_RPlusSImm16 = 1 << 6, // Reg plus signed 16-bit constant.
		    MOF_RPlusLo = 1 << 7, // Reg plus signed 16-bit relocation
		    MOF_RPlusSImm16Mult4 = 1 << 8, // Reg plus 16-bit signed multiple of 4.
		    MOF_RPlusSImm16Mult16 = 1 << 9, // Reg plus 16-bit signed multiple of 16.
		    MOF_RPlusSImm34 = 1 << 10, // Reg plus 34-bit signed constant.
		    MOF_RPlusR = 1 << 11, // Sum of two variables.
		    MOF_PCRel = 1 << 12, // PC-Relative relocation.
		    MOF_AddrIsSImm32 = 1 << 13, // A simple 32-bit constant.

		// The in-memory type.
		MOF_SubWordInt = 1 << 15,
		MOF_WordInt = 1 << 16,
		MOF_DoubleWordInt = 1 << 17,
		MOF_ScalarFloat = 1 << 18, // Scalar single or double precision.
		MOF_Vector = 1 << 19, // Vector types and quad precision scalars.
		nemanjai (Done):
		    MOF_ScalarFloat = 1 << 18, // Scalar single or double precision.
		    MOF_Vector = 1 << 19, // Vector types and quad precision scalars.
		MOF_Vector256 = 1 << 20,
		bsaleil (Done): The flag names are not really uniform; I guess DWInt is for DoubleWordInt? I think we should drop the abbreviation here so it is clearer what the flags represent. Something like:
		    MOF_SubWordInt = 1 << 15,
		    MOF_WordInt = 1 << 16,
		    MOF_DoubleWordInt = 1 << 17,
		    MOF_ScalarFloat = 1 << 18,
		    MOF_Vector = 1 << 19,
		    MOF_Vector256 = 1 << 20,

		// Subtarget features.
		shchenz (Done): Shouldn't the following features not be tied to one specific memory operation? Should we add them into the flag set of each memory operation in `computeMOFlags`? Is it possible to get the sub-target info when we do the addressing mode selection, after we get all the flags?
		amyk (Author, Done): I apologize, I do not quite follow. If you could clarify, that would be great. Are you suggesting that these should not be flags, but rather Subtarget checks after the flags are computed? I think storing the subtarget information in the set of flags is useful because we can easily know which instructions we can produce on P9 and P10. For instance, if we have flags set for `SubtargetP9`, `ScalFlt` and `RPlusSImm16Mult4` (register + signed 16-bit immediate, multiple of 4), then we know we can generate the DS-Form (corresponding to the `DFLOAD` pseudo-instruction). Or, when we have `PPC::MOF_SubtargetP10` and `PPC::MOF_RPlusSImm34` (signed 34-bit immediate), we know we can generate prefixed loads/stores.
		shchenz (Done):
		    > Are you suggesting that these should not be flags, but more of Subtarget checks after the flags are computed?
		    Yes, that was my first thought. For normal load/store instructions before ISEL, we can get the type and address info (zext/sext/imm) from the instruction itself. But the sub-target is not a characteristic of loads/stores. We can check the sub-target info when we do the address mode selection, once we have the other flags. But I am OK with making the sub-target a flag for loads/stores for implementation convenience.
		MOF_SubtargetBeforeP9 = 1 << 22,
		bsaleil (Done): Maybe we should name this flag `MOF_SubtargetBeforeP9` or something like that, to make it clear that the flag is not set on P10.
		MOF_SubtargetP9 = 1 << 23,
		MOF_SubtargetP10 = 1 << 24,
		MOF_SubtargetSPE = 1 << 25
		};

		// The addressing modes for loads and stores.
		enum AddrMode {
		AM_None,
		AM_DForm,
		AM_DSForm,
		AM_DQForm,
		AM_XForm,
		};
} // end namespace PPC		} // end namespace PPC

class PPCTargetLowering : public TargetLowering {		class PPCTargetLowering : public TargetLowering {
const PPCSubtarget &Subtarget;		const PPCSubtarget &Subtarget;

		amyk (Author, Done): Add comments to describe the functions.
public:		public:
explicit PPCTargetLowering(const PPCTargetMachine &TM,		explicit PPCTargetLowering(const PPCTargetMachine &TM,
		bsaleil (Done): I think the map and the function declaration should be moved with the other private fields below.
const PPCSubtarget &STI);		const PPCSubtarget &STI);

/// getTargetNodeName() - This method returns the name of a target specific		/// getTargetNodeName() - This method returns the name of a target specific
/// DAG node.		/// DAG node.
const char *getTargetNodeName(unsigned Opcode) const override;		const char *getTargetNodeName(unsigned Opcode) const override;

bool isSelectSupported(SelectSupportKind Kind) const override {		bool isSelectSupported(SelectSupportKind Kind) const override {
// PowerPC does not support scalar condition selects on vectors.		// PowerPC does not support scalar condition selects on vectors.
▲ Show 20 Lines • Show All 347 Lines • ▼ Show 20 Lines	public:
unsigned getJumpTableEncoding() const override;		unsigned getJumpTableEncoding() const override;
bool isJumpTableRelative() const override;		bool isJumpTableRelative() const override;
SDValue getPICJumpTableRelocBase(SDValue Table,		SDValue getPICJumpTableRelocBase(SDValue Table,
SelectionDAG &DAG) const override;		SelectionDAG &DAG) const override;
const MCExpr *getPICJumpTableRelocBaseExpr(const MachineFunction *MF,		const MCExpr *getPICJumpTableRelocBaseExpr(const MachineFunction *MF,
unsigned JTI,		unsigned JTI,
MCContext &Ctx) const override;		MCContext &Ctx) const override;

		/// SelectOptimalAddrMode - Based on a node N and its Parent (a MemSDNode),
		/// compute the address flags of the node, get the optimal address mode
		/// based on the flags, and set the Base and Disp based on the address mode.
		bsaleil (Done): This method should be private.
		PPC::AddrMode SelectOptimalAddrMode(const SDNode *Parent, SDValue N,
		SDValue &Disp, SDValue &Base,
		SelectionDAG &DAG,
		MaybeAlign Align) const;
		/// SelectForceXFormMode - Given the specified address, force it to be
		bsaleil (Done): This method should be private.
		/// represented as an indexed [r+r] operation (an XForm instruction).
		PPC::AddrMode SelectForceXFormMode(SDValue N, SDValue &Disp, SDValue &Base,
		SelectionDAG &DAG) const;

/// Structure that collects some common arguments that get passed around		/// Structure that collects some common arguments that get passed around
/// between the functions for call lowering.		/// between the functions for call lowering.
struct CallFlags {		struct CallFlags {
const CallingConv::ID CallConv;		const CallingConv::ID CallConv;
const bool IsTailCall : 1;		const bool IsTailCall : 1;
const bool IsVarArg : 1;		const bool IsVarArg : 1;
const bool IsPatchPoint : 1;		const bool IsPatchPoint : 1;
const bool IsIndirect : 1;		const bool IsIndirect : 1;
Show All 26 Lines	struct ReuseLoadInfo {
if (IsDereferenceable)		if (IsDereferenceable)
F |= MachineMemOperand::MODereferenceable;		F |= MachineMemOperand::MODereferenceable;
if (IsInvariant)		if (IsInvariant)
F |= MachineMemOperand::MOInvariant;		F |= MachineMemOperand::MOInvariant;
return F;		return F;
}		}
};		};

		// Map that relates a set of common address flags to PPC addressing modes.
		std::map<PPC::AddrMode, SmallVector<unsigned, 16>> AddrModesMap;
		void initializeAddrModeMap();

bool canReuseLoadAddress(SDValue Op, EVT MemVT, ReuseLoadInfo &RLI,		bool canReuseLoadAddress(SDValue Op, EVT MemVT, ReuseLoadInfo &RLI,
SelectionDAG &DAG,		SelectionDAG &DAG,
ISD::LoadExtType ET = ISD::NON_EXTLOAD) const;		ISD::LoadExtType ET = ISD::NON_EXTLOAD) const;
void spliceIntoChain(SDValue ResChain, SDValue NewResChain,		void spliceIntoChain(SDValue ResChain, SDValue NewResChain,
SelectionDAG &DAG) const;		SelectionDAG &DAG) const;

void LowerFP_TO_INTForReuse(SDValue Op, ReuseLoadInfo &RLI,		void LowerFP_TO_INTForReuse(SDValue Op, ReuseLoadInfo &RLI,
SelectionDAG &DAG, const SDLoc &dl) const;		SelectionDAG &DAG, const SDLoc &dl) const;
▲ Show 20 Lines • Show All 215 Lines • ▼ Show 20 Lines	private:
SDValue lowerToXXSPLTI32DX(ShuffleVectorSDNode *N, SelectionDAG &DAG) const;		SDValue lowerToXXSPLTI32DX(ShuffleVectorSDNode *N, SelectionDAG &DAG) const;

// Return whether the call instruction can potentially be optimized to a		// Return whether the call instruction can potentially be optimized to a
// tail call. This will cause the optimizers to attempt to move, or		// tail call. This will cause the optimizers to attempt to move, or
// duplicate return instructions to help enable tail call optimizations.		// duplicate return instructions to help enable tail call optimizations.
bool mayBeEmittedAsTailCall(const CallInst *CI) const override;		bool mayBeEmittedAsTailCall(const CallInst *CI) const override;
bool hasBitPreservingFPLogic(EVT VT) const override;		bool hasBitPreservingFPLogic(EVT VT) const override;
bool isMaskAndCmp0FoldingBeneficial(const Instruction &AndI) const override;		bool isMaskAndCmp0FoldingBeneficial(const Instruction &AndI) const override;

		/// getAddrModeForFlags - Based on the set of address flags, select the most
		/// optimal instruction format to match by.
		PPC::AddrMode getAddrModeForFlags(unsigned Flags) const;

		/// computeMOFlags - Given a node N and its Parent (a MemSDNode), compute
		/// the address flags of the load/store instruction that is to be matched.
		/// The address flags are stored in a map, which is then searched
		nemanjai (Done): s/are are/are
		/// through to determine the optimal load/store instruction format.
		unsigned computeMOFlags(const SDNode *Parent, SDValue N,
		SelectionDAG &DAG) const;
}; // end class PPCTargetLowering		}; // end class PPCTargetLowering

namespace PPC {		namespace PPC {

FastISel *createFastISel(FunctionLoweringInfo &FuncInfo,		FastISel *createFastISel(FunctionLoweringInfo &FuncInfo,
const TargetLibraryInfo *LibInfo);		const TargetLibraryInfo *LibInfo);

} // end namespace PPC		} // end namespace PPC

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

static SDValue widenVec(SelectionDAG &DAG, SDValue Vec, const SDLoc &dl);		static SDValue widenVec(SelectionDAG &DAG, SDValue Vec, const SDLoc &dl);

// FIXME: Remove this once the bug has been fixed!		// FIXME: Remove this once the bug has been fixed!
extern cl::opt<bool> ANDIGlueBug;		extern cl::opt<bool> ANDIGlueBug;

PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,		PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
const PPCSubtarget &STI)		const PPCSubtarget &STI)
: TargetLowering(TM), Subtarget(STI) {		: TargetLowering(TM), Subtarget(STI) {
		// Initialize map that relates the PPC addressing modes to the computed flags
		// of a load/store instruction. The map is used to determine the optimal
		// addressing mode when selecting load and stores.
		initializeAddrModeMap();
// On PPC32/64, arguments smaller than 4/8 bytes are extended, so all		// On PPC32/64, arguments smaller than 4/8 bytes are extended, so all
// arguments are at least 4/8 bytes aligned.		// arguments are at least 4/8 bytes aligned.
bool isPPC64 = Subtarget.isPPC64();		bool isPPC64 = Subtarget.isPPC64();
setMinStackArgumentAlignment(isPPC64 ? Align(8) : Align(4));		setMinStackArgumentAlignment(isPPC64 ? Align(8) : Align(4));

// Set up the register classes.		// Set up the register classes.
addRegisterClass(MVT::i32, &PPC::GPRCRegClass);		addRegisterClass(MVT::i32, &PPC::GPRCRegClass);
if (!useSoftFloat()) {		if (!useSoftFloat()) {
▲ Show 20 Lines • Show All 1,272 Lines • ▼ Show 20 Lines
IsStrictFPEnabled = true;		IsStrictFPEnabled = true;

// Let the subtarget (CPU) decide if a predictable select is more expensive		// Let the subtarget (CPU) decide if a predictable select is more expensive
// than the corresponding branch. This information is used in CGP to decide		// than the corresponding branch. This information is used in CGP to decide
// when to convert selects into branches.		// when to convert selects into branches.
PredictableSelectIsExpensive = Subtarget.isPredictableSelectIsExpensive();		PredictableSelectIsExpensive = Subtarget.isPredictableSelectIsExpensive();
}		}

		// ********************************* NOTE **********************************
		// For selecting load and store instructions, the addressing modes are defined
		nemanjai (Done): s/instruction formats/addressing modes
		// as ComplexPatterns in PPCInstrInfo.td, which are then utilized in the TD
		// patterns to match the load and store instructions.
		//
		// The TD definitions for the addressing modes correspond to their respective
		// Select<AddrMode>Form() function in PPCISelDAGToDAG.cpp. These functions rely
		nemanjai (Done): I think you should add some comments regarding sample instructions that correspond to these sets of flags. For example:
		    PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_WordInt, // LWZ
		    PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt, // LBZ, LHZ
		    PPC::MOF_SExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt, // LHA
		    ...
		// on SelectOptimalAddrMode(), which calls computeMOFlags() to compute the
		// address mode flags of a particular node. Afterwards, the computed address
		// flags are passed into getAddrModeForFlags() in order to retrieve the optimal
		// addressing mode. SelectOptimalAddrMode() then sets the Base and Displacement
		// accordingly, based on the preferred addressing mode.
		//
		// Within PPCISelLowering.h, there are two enums: MemOpFlags and AddrMode.
		// MemOpFlags contains all the possible flags that can be used to compute the
		// optimal addressing mode for load and store instructions.
		// AddrMode contains all the possible load and store addressing modes available
		// on Power (such as DForm, DSForm, DQForm, XForm, etc.)
		//
		// When adding new load and store instructions, it is possible that new address
		// flags may need to be added into MemOpFlags, and a new addressing mode will
		// need to be added to AddrMode. An entry of the new addressing mode (consisting
		// of the minimal and main distinguishing address flags for the new load/store
		// instructions) will need to be added into initializeAddrModeMap() below.
		// Finally, when adding new addressing modes, the getAddrModeForFlags() will
		// need to be updated to account for selecting the optimal addressing mode.
		// *****************************************************************************
		/// Initialize the map that relates the different addressing modes of the load
		/// and store instructions to a set of flags. This ensures the load/store
		/// instruction is correctly matched during instruction selection.
		void PPCTargetLowering::initializeAddrModeMap() {
		AddrModesMap[PPC::AM_DForm] = {
		// LWZ, STW
		PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_WordInt,
		PPC::MOF_ZExt | PPC::MOF_RPlusLo | PPC::MOF_WordInt,
		PPC::MOF_ZExt | PPC::MOF_NotAddNorCst | PPC::MOF_WordInt,
		PPC::MOF_ZExt | PPC::MOF_AddrIsSImm32 | PPC::MOF_WordInt,
		// LBZ, LHZ, STB, STH
		PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt,
		PPC::MOF_ZExt | PPC::MOF_RPlusLo | PPC::MOF_SubWordInt,
		PPC::MOF_ZExt | PPC::MOF_NotAddNorCst | PPC::MOF_SubWordInt,
		PPC::MOF_ZExt | PPC::MOF_AddrIsSImm32 | PPC::MOF_SubWordInt,
		// LHA
		PPC::MOF_SExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt,
		PPC::MOF_SExt | PPC::MOF_RPlusLo | PPC::MOF_SubWordInt,
		PPC::MOF_SExt | PPC::MOF_NotAddNorCst | PPC::MOF_SubWordInt,
		PPC::MOF_SExt | PPC::MOF_AddrIsSImm32 | PPC::MOF_SubWordInt,
		// LFS, LFD, STFS, STFD
		PPC::MOF_RPlusSImm16 | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetBeforeP9,
		PPC::MOF_RPlusLo | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetBeforeP9,
		PPC::MOF_NotAddNorCst | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetBeforeP9,
		PPC::MOF_AddrIsSImm32 | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetBeforeP9,
		};
		AddrModesMap[PPC::AM_DSForm] = {
		// LWA
		PPC::MOF_SExt | PPC::MOF_RPlusSImm16Mult4 | PPC::MOF_WordInt,
		PPC::MOF_SExt | PPC::MOF_NotAddNorCst | PPC::MOF_WordInt,
		PPC::MOF_SExt | PPC::MOF_AddrIsSImm32 | PPC::MOF_WordInt,
		// LD, STD
		PPC::MOF_RPlusSImm16Mult4 | PPC::MOF_DoubleWordInt,
		PPC::MOF_NotAddNorCst | PPC::MOF_DoubleWordInt,
		PPC::MOF_AddrIsSImm32 | PPC::MOF_DoubleWordInt,
		// DFLOADf32, DFLOADf64, DSTOREf32, DSTOREf64
		PPC::MOF_RPlusSImm16Mult4 | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetP9,
		PPC::MOF_NotAddNorCst | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetP9,
		PPC::MOF_AddrIsSImm32 | PPC::MOF_ScalarFloat | PPC::MOF_SubtargetP9,
		};
		AddrModesMap[PPC::AM_DQForm] = {
		// LXV, STXV
		PPC::MOF_RPlusSImm16Mult16 | PPC::MOF_Vector | PPC::MOF_SubtargetP9,
		PPC::MOF_NotAddNorCst | PPC::MOF_Vector | PPC::MOF_SubtargetP9,
		PPC::MOF_AddrIsSImm32 | PPC::MOF_Vector | PPC::MOF_SubtargetP9,
		PPC::MOF_RPlusSImm16Mult16 | PPC::MOF_Vector256 | PPC::MOF_SubtargetP10,
		PPC::MOF_NotAddNorCst | PPC::MOF_Vector256 | PPC::MOF_SubtargetP10,
		PPC::MOF_AddrIsSImm32 | PPC::MOF_Vector256 | PPC::MOF_SubtargetP10,
		};
		}

/// getMaxByValAlign - Helper for getByValTypeAlignment to determine		/// getMaxByValAlign - Helper for getByValTypeAlignment to determine
/// the desired ByVal argument alignment.		/// the desired ByVal argument alignment.
static void getMaxByValAlign(Type *Ty, Align &MaxAlign, Align MaxMaxAlign) {		static void getMaxByValAlign(Type *Ty, Align &MaxAlign, Align MaxMaxAlign) {
if (MaxAlign == MaxMaxAlign)		if (MaxAlign == MaxMaxAlign)
return;		return;
if (VectorType *VTy = dyn_cast<VectorType>(Ty)) {		if (VectorType *VTy = dyn_cast<VectorType>(Ty)) {
if (MaxMaxAlign >= 32 &&		if (MaxMaxAlign >= 32 &&
VTy->getPrimitiveSizeInBits().getFixedSize() >= 256)		VTy->getPrimitiveSizeInBits().getFixedSize() >= 256)
▲ Show 20 Lines • Show All 996 Lines • ▼ Show 20 Lines	if (N->getValueType(0) == MVT::i32)
return Imm == (int32_t)cast<ConstantSDNode>(N)->getZExtValue();		return Imm == (int32_t)cast<ConstantSDNode>(N)->getZExtValue();
else		else
return Imm == (int64_t)cast<ConstantSDNode>(N)->getZExtValue();		return Imm == (int64_t)cast<ConstantSDNode>(N)->getZExtValue();
}		}
bool llvm::isIntS16Immediate(SDValue Op, int16_t &Imm) {		bool llvm::isIntS16Immediate(SDValue Op, int16_t &Imm) {
return isIntS16Immediate(Op.getNode(), Imm);		return isIntS16Immediate(Op.getNode(), Imm);
}		}

		/// Used when computing address flags for selecting loads and stores.
		nemanjai (Done): I think we no longer need to put the name of the function in Doxygen comments.
		/// If we have an OR, check if the LHS and RHS are provably disjoint.
		/// An OR of two provably disjoint values is equivalent to an ADD.
		/// Most PPC load/store instructions compute the effective address as a sum,
		nemanjai (Done): Replace:
		    This is for when we have an OR of disjoint bitfields, we can codegen it as an add (for better address arithmetic).
		with:
		    An OR of two provably disjoint values is equivalent to an ADD. Most PPC load/store instructions compute the effective address as a sum, so doing this conversion is useful.
		/// so doing this conversion is useful.
		static bool provablyDisjointOr(SelectionDAG &DAG, const SDValue &N) {
		if (N.getOpcode() != ISD::OR)
		return false;
		KnownBits LHSKnown = DAG.computeKnownBits(N.getOperand(0));
		stefanp (Done): I find it odd that you are zero extending this (i.e. unsigned) and then casting it to a signed. You can get into all kinds of issues with this kind of thing. For example:
		    int16_t a = -1; // This is 0xFFFF
		    int32_t b = zeroExtend(a); // This is 0x0000FFFF (Not -1 but 65535)
		It is important to think about what the possible value types are for `N`. If the only possible types are `MVT::i32` and `MVT::i64` then we are fine.
		if (!LHSKnown.Zero.getBoolValue())
		return false;
		KnownBits RHSKnown = DAG.computeKnownBits(N.getOperand(1));
		return (~(LHSKnown.Zero | RHSKnown.Zero) == 0);
		stefanp (Done): You can probably simplify this a little bit. See what others think too...
		    Imm = (int32_t)cast<ConstantSDNode>(N)->getZExtValue();
		    int64_t Imm64 = (int64_t)cast<ConstantSDNode>(N)->getZExtValue();
		    return isInt<32>(Imm64);
		}
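
Annotation (not part of the patch): a minimal standalone sketch of why the check above is sound. It models known-zero bits with plain 64-bit masks instead of LLVM's `KnownBits`; the names `coversAllBits` and `orEqualsAdd` are hypothetical. If every bit position is known to be zero in at least one operand, no position can generate a carry, so OR and ADD produce the same value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the KnownBits test in provablyDisjointOr():
// each mask has a 1 wherever that operand's bit is known to be zero.
// The operands are disjoint if every bit is known zero on at least one side.
constexpr bool coversAllBits(uint64_t LHSKnownZero, uint64_t RHSKnownZero) {
  return ~(LHSKnownZero | RHSKnownZero) == 0;
}

// With no overlapping set bits there are no carries, so OR equals ADD.
constexpr bool orEqualsAdd(uint64_t A, uint64_t B) {
  return (A | B) == (A + B);
}

// A 16-byte-aligned base has its low four bits known zero; an offset
// below 16 has all of its upper bits known zero.
static_assert(coversAllBits(0xFULL, ~0xFULL), "masks cover every bit");
static_assert(orEqualsAdd(0x1000, 0x8), "disjoint values: OR == ADD");
static_assert(!orEqualsAdd(0x1008, 0x8), "overlapping bit 3: OR != ADD");
```

This is the usual situation for an aligned base OR-ed with a small offset, which is why treating such an OR as an ADD helps address selection.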

/// SelectAddressEVXRegReg - Given the specified address, check to see if it can		/// SelectAddressEVXRegReg - Given the specified address, check to see if it can
/// be represented as an indexed [r+r] operation.		/// be represented as an indexed [r+r] operation.
bool PPCTargetLowering::SelectAddressEVXRegReg(SDValue N, SDValue &Base,		bool PPCTargetLowering::SelectAddressEVXRegReg(SDValue N, SDValue &Base,
SDValue &Index,		SDValue &Index,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
for (SDNode::use_iterator UI = N->use_begin(), E = N->use_end();		for (SDNode::use_iterator UI = N->use_begin(), E = N->use_end();
UI != E; ++UI) {		UI != E; ++UI) {
if (MemSDNode *Memop = dyn_cast<MemSDNode>(*UI)) {		if (MemSDNode *Memop = dyn_cast<MemSDNode>(*UI)) {
		lkail (Done): Should it be a `static` function?
		amyk (Author, Done): Good point. I will fix this.
if (Memop->getMemoryVT() == MVT::f64) {		if (Memop->getMemoryVT() == MVT::f64) {
Base = N.getOperand(0);		Base = N.getOperand(0);
Index = N.getOperand(1);		Index = N.getOperand(1);
return true;		return true;
}		}
}		}
}		}
return false;		return false;
▲ Show 20 Lines • Show All 14,375 Lines • ▼ Show 20 Lines	if (TrueOpnd.getOperand(0) == CmpOpnd1 &&
FalseOpnd.getOperand(1) == CmpOpnd1) {		FalseOpnd.getOperand(1) == CmpOpnd1) {
return DAG.getNode(PPCISD::VABSD, dl, N->getOperand(1).getValueType(),		return DAG.getNode(PPCISD::VABSD, dl, N->getOperand(1).getValueType(),
CmpOpnd1, CmpOpnd2,		CmpOpnd1, CmpOpnd2,
DAG.getTargetConstant(0, dl, MVT::i32));		DAG.getTargetConstant(0, dl, MVT::i32));
}		}

return SDValue();		return SDValue();
}		}

		/// getAddrModeForFlags - Based on the set of address flags, select the most
		/// optimal instruction format to match by.
		PPC::AddrMode PPCTargetLowering::getAddrModeForFlags(unsigned Flags) const {
		// This is not a node we should be handling here.
		if (Flags == PPC::MOF_None)
		return PPC::AM_None;
		// Unaligned D-Forms are tried first, followed by the aligned D-Forms.
		for (auto FlagSet : AddrModesMap.at(PPC::AM_DForm))
		if ((Flags & FlagSet) == FlagSet)
		return PPC::AM_DForm;
		for (auto FlagSet : AddrModesMap.at(PPC::AM_DSForm))
		if ((Flags & FlagSet) == FlagSet)
		return PPC::AM_DSForm;
		for (auto FlagSet : AddrModesMap.at(PPC::AM_DQForm))
		if ((Flags & FlagSet) == FlagSet)
		return PPC::AM_DQForm;
		// If no other forms are selected, return an X-Form as it is the most
		nemanjai (Done): s/return an X-Form instructions can always be matched/return an X-Form as it is the most general addressing mode
		// general addressing mode.
		return PPC::AM_XForm;
		}
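
Annotation (not part of the patch): a reduced sketch of the subset test used above, `(Flags & FlagSet) == FlagSet`. The enums below copy only a few values from `MemOpFlags` and `AddrMode`, the table holds only three of the flag sets, and `pickAddrMode` is a hypothetical name; it is meant to show the lookup order and the X-Form fallback, not to reproduce the full map.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, reduced copies of the patch's enums, for illustration only.
enum MemOpFlags : unsigned {
  MOF_None = 0,
  MOF_SExt = 1,
  MOF_ZExt = 1 << 1,
  MOF_RPlusSImm16 = 1 << 6,
  MOF_RPlusSImm16Mult4 = 1 << 8,
  MOF_WordInt = 1 << 16,
  MOF_DoubleWordInt = 1 << 17,
  MOF_SubtargetP9 = 1 << 23,
};
enum AddrMode { AM_None, AM_DForm, AM_DSForm, AM_XForm };

// Return the first mode, in priority order, that has a flag set fully
// contained in Flags; fall back to X-Form, the most general mode.
inline AddrMode pickAddrMode(unsigned Flags) {
  static const std::map<AddrMode, std::vector<unsigned>> Table = {
      {AM_DForm, {MOF_ZExt | MOF_RPlusSImm16 | MOF_WordInt}},
      {AM_DSForm,
       {MOF_SExt | MOF_RPlusSImm16Mult4 | MOF_WordInt,
        MOF_RPlusSImm16Mult4 | MOF_DoubleWordInt}},
  };
  if (Flags == MOF_None)
    return AM_None;
  for (AddrMode Mode : {AM_DForm, AM_DSForm})
    for (unsigned FlagSet : Table.at(Mode))
      if ((Flags & FlagSet) == FlagSet)
        return Mode;
  return AM_XForm;
}
```

Extra bits in `Flags` (for example a subtarget flag) do not prevent a match, because only the bits named in a table entry are compared. A sign-extending word load whose offset is not a multiple of 4 matches neither table and falls through to X-Form.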

		/// Set alignment flags based on whether or not the Frame Index is aligned.
		/// Utilized when computing flags for address computation when selecting
		/// load and store instructions.
		bsaleil (Done): `The address flags are are stored` -> `The address flags are stored`
		static void setAlignFlagsForFI(SDValue N, unsigned &FlagSet,
		nemanjai (Done): Remove this part of the comment. That is not done here AFAICT.
		SelectionDAG &DAG) {
		bool IsAdd = ((N.getOpcode() == ISD::ADD) || (N.getOpcode() == ISD::OR));
		FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(IsAdd ? N.getOperand(0) : N);
		nemanjai (Not Done): Please reduce nesting by flipping this and early-exiting:
		    FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(IsAdd ? N.getOperand(0) : N);
		    if (!FI)
		      return;
		    // The rest of the code is not nested in this if
		if (!FI)
		return;
		const MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();
		nemanjai (Not Done): It might be clearer to call this something like `FrameIndexAlign` since it seems to refer to alignment rather than a value.
		unsigned FrameIndexAlign = MFI.getObjectAlign(FI->getIndex()).value();
		bsaleil (Done): I think you can directly use the `Subtarget` field of the `PPCTargetLowering` class instead of using a `static_cast` here.
		// If this is (add $FI, $S16Imm), the alignment flags are already set
		// based on the immediate. We just need to clear the alignment flags
		// if the FI alignment is weaker.
		nemanjai (Done): I think this should be checking for prefixed instructions. It is entirely possible to have `-mcpu=pwr9 -Xclang -target-feature -Xclang -prefix-instrs` (or `llc -mattr=-prefix-instrs`).
		amyk (Author, Done): I actually added a flag for prefixed instructions in https://reviews.llvm.org/D96075. Is this approach still acceptable, or would you prefer it to be done here instead in the approach you mentioned?
		if ((FrameIndexAlign % 4) != 0)
		FlagSet &= ~PPC::MOF_RPlusSImm16Mult4;
		if ((FrameIndexAlign % 16) != 0)
		FlagSet &= ~PPC::MOF_RPlusSImm16Mult16;
		// If the address is a plain FrameIndex, set alignment flags based on
		// FI alignment.
		if (!IsAdd) {
		if ((FrameIndexAlign % 4) == 0)
		FlagSet |= PPC::MOF_RPlusSImm16Mult4;
		if ((FrameIndexAlign % 16) == 0)
		bsaleil (Done): This line should be moved below just before the assert.
		FlagSet |= PPC::MOF_RPlusSImm16Mult16;
		}
		}
		nemanjai (Done): This is a large lambda that is only called once. AFAICT, it only captures `FlagSet`. I think it is probably better off as a `static` function and `FlagSet` can be passed by reference. Removing it will also remove `SetAlignFlagsForImm` from this function and the whole thing should become significantly more readable. Also as a minor nit, you are less likely to have these mismatches in naming/capitalization such as between `SetAlignFlagsForImm` and `computeFlagsForAddressComputation`. Both are lambdas, one is capitalized as a variable and the other as a function - understandable omission since a lambda is both.
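
Annotation (not part of the patch): a standalone sketch of the clear-then-set logic in `setAlignFlagsForFI()`, with the frame object's alignment passed in as a plain integer. The constants mirror two `MemOpFlags` values; `adjustForFrameAlign` and its `HasOffset` parameter are hypothetical names standing in for the `IsAdd` case.

```cpp
#include <cassert>

// Hypothetical flag values mirroring MOF_RPlusSImm16Mult4 / Mult16.
constexpr unsigned Mult4 = 1u << 8;
constexpr unsigned Mult16 = 1u << 9;

// Adjust alignment flags for a frame object with the given alignment.
// HasOffset models the (add $FI, $Imm) case: flags already derived from the
// immediate may only be cleared. For a bare frame index they are also set.
inline unsigned adjustForFrameAlign(unsigned Flags, unsigned FrameAlign,
                                    bool HasOffset) {
  if (FrameAlign % 4 != 0)
    Flags &= ~Mult4;
  if (FrameAlign % 16 != 0)
    Flags &= ~Mult16;
  if (!HasOffset) {
    if (FrameAlign % 4 == 0)
      Flags |= Mult4;
    if (FrameAlign % 16 == 0)
      Flags |= Mult16;
  }
  return Flags;
}
```

For example, an offset that is a multiple of 16 on an 8-byte-aligned stack slot keeps only the multiple-of-4 flag, so a DQ-Form access would not be chosen for it.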

		/// Given a node, compute flags that are used for address computation when
		/// selecting load and store instructions. The flags computed are stored in
		/// FlagSet. This function takes into account whether the node is a constant,
		/// an ADD, or an OR, and computes the address flags accordingly.
		static void computeFlagsForAddressComputation(SDValue N, unsigned &FlagSet,
		SelectionDAG &DAG) {
		// Set the alignment flags for the node depending on if the node is
		// 4-byte or 16-byte aligned.
		auto SetAlignFlagsForImm = [&](uint64_t Imm) {
		if ((Imm & 0x3) == 0)
		FlagSet \|= PPC::MOF_RPlusSImm16Mult4;
		if ((Imm & 0xf) == 0)
		FlagSet \|= PPC::MOF_RPlusSImm16Mult16;
		};

		if (ConstantSDNode *CN = dyn_cast<ConstantSDNode>(N)) {
		// All 32-bit constants can be computed as LIS + Disp.
		const APInt &ConstImm = CN->getAPIntValue();
		if (ConstImm.isSignedIntN(32)) { // Flag to handle 32-bit constants.
		FlagSet \|= PPC::MOF_AddrIsSImm32;
		SetAlignFlagsForImm(ConstImm.getZExtValue());
		setAlignFlagsForFI(N, FlagSet, DAG);
		}
		if (ConstImm.isSignedIntN(34)) // Flag to handle 34-bit constants.
		FlagSet \|= PPC::MOF_RPlusSImm34;
		else // Let constant materialization handle large constants.
		FlagSet \|= PPC::MOF_NotAddNorCst;
		} else if (N.getOpcode() == ISD::ADD \|\| provablyDisjointOr(DAG, N)) {
		// This address can be represented as an addition of:
		// - Register + Imm16 (possibly a multiple of 4/16)
		// - Register + Imm34
		// - Register + PPCISD::Lo
		// - Register + Register
		// In any case, we won't have to match this as Base + Zero.
		SDValue RHS = N.getOperand(1);
		if (ConstantSDNode *CN = dyn_cast<ConstantSDNode>(RHS)) {
		const APInt &ConstImm = CN->getAPIntValue();
		if (ConstImm.isSignedIntN(16)) {
		FlagSet \|= PPC::MOF_RPlusSImm16; // Signed 16-bit immediates.
		SetAlignFlagsForImm(ConstImm.getZExtValue());
		setAlignFlagsForFI(N, FlagSet, DAG);
		}
		if (ConstImm.isSignedIntN(34))
		[Inline comment] bsaleil (Done): `consstants` -> `constants`
		FlagSet |= PPC::MOF_RPlusSImm34; // Signed 34-bit immediates.
		else
		FlagSet |= PPC::MOF_RPlusR; // Register.
		} else if (RHS.getOpcode() == PPCISD::Lo &&
		!cast<ConstantSDNode>(RHS.getOperand(1))->getZExtValue())
		FlagSet |= PPC::MOF_RPlusLo; // PPCISD::Lo.
		else
		[Inline comment] stefanp (Done): For this case it may be easier to just work with the APInt instead of creating your own functions. So, you may not need `isIntS32Immediate` and `isIntS34Immediate`. Especially since you don't use Imm34.
		FlagSet |= PPC::MOF_RPlusR;
		} else { // The address computation is not a constant or an addition.
		setAlignFlagsForFI(N, FlagSet, DAG);
		FlagSet |= PPC::MOF_NotAddNorCst;
		}
		}
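
The constant classification above can be tried outside LLVM. The following is a minimal standalone sketch, not the patch's code: the `k*` flag values and the helper names `fitsSigned` and `flagsForRegPlusImm` are hypothetical, and the multiple-of-4 and multiple-of-16 checks are an assumption based on the "possibly a multiple of 4/16" comment, since the body of `SetAlignFlagsForImm` is not fully shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration only; the real PPC::MOF_*
// values live in PPCISelLowering.h and are not reproduced here.
enum : unsigned {
  kRPlusSImm16 = 1u << 0,
  kRPlusSImm16Mult4 = 1u << 1,
  kRPlusSImm16Mult16 = 1u << 2,
  kRPlusSImm34 = 1u << 3,
  kRPlusR = 1u << 4,
};

// Returns true if Value is representable as a signed integer of Bits bits.
// Plays the role that APInt::isSignedIntN plays in the patch.
inline bool fitsSigned(int64_t Value, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "unsupported width");
  if (Bits == 64)
    return true;
  const int64_t Limit = int64_t{1} << (Bits - 1);
  return Value >= -Limit && Value < Limit;
}

// Mirrors the "register + constant" branch: a 16-bit immediate also gets
// alignment flags (assumed behavior), and anything that does not fit in
// 34 bits falls back to register + register.
inline unsigned flagsForRegPlusImm(int64_t Imm) {
  unsigned Flags = 0;
  if (fitsSigned(Imm, 16)) {
    Flags |= kRPlusSImm16;
    if (Imm % 4 == 0)
      Flags |= kRPlusSImm16Mult4;
    if (Imm % 16 == 0)
      Flags |= kRPlusSImm16Mult16;
  }
  if (fitsSigned(Imm, 34))
    Flags |= kRPlusSImm34;
  else
    Flags |= kRPlusR;
  return Flags;
}
```

Under these assumptions, an offset of 8 sets the 16-bit, multiple-of-4, and 34-bit flags, while an offset of 2^33 only sets the register + register flag.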

		/// computeMOFlags - Given a node N and its Parent (a MemSDNode), compute
		/// the address flags of the load/store instruction that is to be matched.
		unsigned PPCTargetLowering::computeMOFlags(const SDNode *Parent, SDValue N,
		[Inline comment] nemanjai (Done): s/Vectors only/Integer vectors.
		SelectionDAG &DAG) const {
		unsigned FlagSet = PPC::MOF_None;

		// Compute subtarget flags.
		if (!Subtarget.hasP9Vector())
		FlagSet |= PPC::MOF_SubtargetBeforeP9;
		else {
		FlagSet |= PPC::MOF_SubtargetP9;
		if (Subtarget.hasPrefixInstrs())
		FlagSet |= PPC::MOF_SubtargetP10;
		}
		[Inline comment] nemanjai (Done): Seems like we should have another unreachable here in case we end up with an illegal floating point type such as half precision or some weird FP type (such as Intel's 80-bit FP).
		if (Subtarget.hasSPE())
		FlagSet |= PPC::MOF_SubtargetSPE;
		[Inline comment] bsaleil (Done): Should we name this flag `MOF_NotAddNorCst` then?

		// Mark this as something we don't want to handle here if it is atomic
		// or pre-increment instruction.
		if (const LSBaseSDNode *LSB = dyn_cast<LSBaseSDNode>(Parent))
		if (LSB->isIndexed())
		return PPC::MOF_None;
		if (isa<AtomicSDNode>(Parent))
		return PPC::MOF_None;

		// Compute in-memory type flags. This is based on if there are scalars,
		// floats or vectors.
		const MemSDNode *MN = dyn_cast<MemSDNode>(Parent);
		assert(MN && "Parent should be a MemSDNode!");
		EVT MemVT = MN->getMemoryVT();
		unsigned Size = MemVT.getSizeInBits();
		if (MemVT.isScalarInteger()) {
		assert(Size <= 64 && "Not expecting scalar integers larger than 8 bytes!");
		if (Size < 32)
		FlagSet |= PPC::MOF_SubWordInt;
		else if (Size == 32)
		FlagSet |= PPC::MOF_WordInt;
		[Inline comment] nemanjai (Done): Add this to the comment: // We set the extension mode to zero extension so we don't have // to add separate entries in AddrModesMap for stores and loads.
		else
		FlagSet |= PPC::MOF_DoubleWordInt;
		} else if (MemVT.isVector() && !MemVT.isFloatingPoint()) { // Integer vectors.
		if (Size == 128)
		FlagSet |= PPC::MOF_Vector;
		else if (Size == 256)
		FlagSet |= PPC::MOF_Vector256;
		[Inline comment] nemanjai (Done): // If we don't have prefixed instructions, 34-bit constants should be // treated as PPC::MOF_NotAddNorCst so they can match D-Forms.
		else
		llvm_unreachable("Not expecting illegal vectors!");
		} else { // Floating point type: can be scalar, f128 or vector types.
		if (Size == 32 || Size == 64)
		[Inline comment] nemanjai (Done): bool Is34BitConstNoP10 = (PPC::MOF_RPlusSImm34 | PPC::MOF_AddrIsSImm32 | PPC::MOF_SubtargetP10) & FlagSet == PPC::MOF_RPlusSImm34; if (N.getOpcode() != ISD::ADD && N.getOpcode() != ISD::OR && IsNonP1034BitConst) FlagSet |= PPC::MOF_NotAddNorCst;
		FlagSet |= PPC::MOF_ScalarFloat;
		else if (MemVT == MVT::f128 || MemVT.isVector())
		FlagSet |= PPC::MOF_Vector;
		else
		llvm_unreachable("Not expecting illegal scalar floats!");
		}

		// Compute flags for address computation.
		computeFlagsForAddressComputation(N, FlagSet, DAG);

		// Compute type extension flags.
		if (const LoadSDNode *LN = dyn_cast<LoadSDNode>(Parent)) {
		switch (LN->getExtensionType()) {
		case ISD::SEXTLOAD:
		FlagSet |= PPC::MOF_SExt;
		break;
		case ISD::EXTLOAD:
		case ISD::ZEXTLOAD:
		[Inline comment] bsaleil (Done): I know this is code we already have in `SelectAddressRegRegOnly`, but isn't that condition weird? Or at least it doesn't match the comment above. If I understand correctly, this is the case where we get rid of the `add`, but looking at the condition, we get rid of it only if it is not an add of a value and a 16-bit signed constant or if one of the operands doesn't have a single use. Shouldn't the condition be `(N.getOpcode() == ISD::ADD && !isIntS16Immediate(N.getOperand(1), ForceXFormImm) && N.getOperand(1).hasOneUse() && N.getOperand(0).hasOneUse())` instead?
		[Inline comment] amyk (Author, Done): That's a good point, Baptiste. You're right in that this is the case where we eliminate the `add`. I thought about it a bit, and I believe this condition might match the comment more (only get rid of the `add` if we do not have an `add` of a value and a signed 16-bit immediate, and the two operands don't have a single use): `if (N.getOpcode() == ISD::ADD && (!(isIntS16Immediate(N.getOperand(1), imm) && N.getOperand(1).hasOneUse() && N.getOperand(0).hasOneUse())))`, which I believe should then be equivalent to the condition in the code. But yes, the condition was previously in `SelectAddressRegRegOnly()`, which is why I kept it. If there are any more concerns on the condition and/or comment, I can probably adjust it.
		FlagSet |= PPC::MOF_ZExt;
		break;
		case ISD::NON_EXTLOAD:
		FlagSet |= PPC::MOF_NoExt;
		break;
		}
		} else
		FlagSet |= PPC::MOF_NoExt;

		// For integers, no extension is the same as zero extension.
		// We set the extension mode to zero extension so we don't have
		// to add separate entries in AddrModesMap for loads and stores.
		if (MemVT.isScalarInteger() && (FlagSet & PPC::MOF_NoExt)) {
		FlagSet |= PPC::MOF_ZExt;
		FlagSet &= ~PPC::MOF_NoExt;
		}

		// If we don't have prefixed instructions, 34-bit constants should be
		// treated as PPC::MOF_NotAddNorCst so they can match D-Forms.
		bool IsNonP1034BitConst =
		((PPC::MOF_RPlusSImm34 | PPC::MOF_AddrIsSImm32 | PPC::MOF_SubtargetP10) &
		FlagSet) == PPC::MOF_RPlusSImm34;
		if (N.getOpcode() != ISD::ADD && N.getOpcode() != ISD::OR &&
		IsNonP1034BitConst)
		FlagSet |= PPC::MOF_NotAddNorCst;

		return FlagSet;
		}
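
The `IsNonP1034BitConst` test near the end of `computeMOFlags` is a mask-and-compare: it holds only when, among the three flags in the mask, exactly the 34-bit flag is set. Below is a small standalone sketch of that idiom. The bit positions and the name `isNonP10Const34` are hypothetical and chosen only for illustration. Note that in C++ `==` binds more tightly than `&`, so the masked value has to be parenthesized before the comparison, as the committed code does.

```cpp
#include <cstdint>

// Hypothetical bit positions; the real PPC::MOF_* values are not shown here.
enum : unsigned {
  kRPlusSImm34 = 1u << 0,
  kAddrIsSImm32 = 1u << 1,
  kSubtargetP10 = 1u << 2,
  kNotAddNorCst = 1u << 3,
};

// True when the 34-bit flag is set while neither the 32-bit-address flag
// nor the P10 subtarget flag is set. Bits outside the mask are ignored.
constexpr bool isNonP10Const34(unsigned Flags) {
  constexpr unsigned Mask = kRPlusSImm34 | kAddrIsSImm32 | kSubtargetP10;
  return (Flags & Mask) == kRPlusSImm34;
}

static_assert(isNonP10Const34(kRPlusSImm34), "34-bit constant without P10");
static_assert(!isNonP10Const34(kRPlusSImm34 | kSubtargetP10),
              "P10 can use the prefixed form");
```

Comparing the masked value against a single flag checks "this bit set and those bits clear" in one expression, instead of three separate tests.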

		/// SelectForceXFormMode - Given the specified address, force it to be
		/// represented as an indexed [r+r] operation (an XForm instruction).
		PPC::AddrMode PPCTargetLowering::SelectForceXFormMode(SDValue N, SDValue &Disp,
		SDValue &Base,
		SelectionDAG &DAG) const {

		PPC::AddrMode Mode = PPC::AM_XForm;
		int16_t ForceXFormImm = 0;
		if (provablyDisjointOr(DAG, N) &&
		!isIntS16Immediate(N.getOperand(1), ForceXFormImm)) {
		Disp = N.getOperand(0);
		Base = N.getOperand(1);
		return Mode;
		}

		// If the address is the result of an add, we will utilize the fact that the
		// address calculation includes an implicit add. However, we can reduce
		// register pressure if we do not materialize a constant just for use as the
		// index register. We only get rid of the add if it is not an add of a
		// value and a 16-bit signed constant and both have a single use.
		[Inline comment] nemanjai (Done): This condition seems superfluous. Why do we check that the operand is a signed 16-bit constant (and then create another constant with the same value) when we have `PPC::MOF_RPlusSImm16` set? Doesn't the flag already assure us that this is so? Can we not just assert that this is so?
		if (N.getOpcode() == ISD::ADD &&
		(!isIntS16Immediate(N.getOperand(1), ForceXFormImm) ||
		!N.getOperand(1).hasOneUse() || !N.getOperand(0).hasOneUse())) {
		Disp = N.getOperand(0);
		Base = N.getOperand(1);
		return Mode;
		}

		// Otherwise, use R0 as the base register.
		Disp = DAG.getRegister(Subtarget.isPPC64() ? PPC::ZERO8 : PPC::ZERO,
		N.getValueType());
		Base = N;

		return Mode;
		}
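
The review thread discusses three phrasings of the `ISD::ADD` condition in `SelectForceXFormMode`. The sketch below models them as plain boolean predicates, leaving out the common `N.getOpcode() == ISD::ADD` term, so they can be compared over all eight inputs. The function names are hypothetical; `A` stands for "operand 1 is a signed 16-bit immediate", `B` for "operand 1 has one use", and `C` for "operand 0 has one use".

```cpp
// Form in the patch: fold the add unless A, B and C all hold.
constexpr bool asCommitted(bool A, bool B, bool C) { return !A || !B || !C; }

// The author's restatement from the review thread.
constexpr bool asRestated(bool A, bool B, bool C) { return !(A && B && C); }

// The reviewer's proposed alternative.
constexpr bool asProposed(bool A, bool B, bool C) { return !A && B && C; }

// Counts the truth assignments (out of 8) on which two predicates differ.
template <typename F, typename G>
constexpr int countDisagreements(F First, G Second) {
  int Count = 0;
  for (int Bits = 0; Bits < 8; ++Bits) {
    const bool A = (Bits & 1) != 0;
    const bool B = (Bits & 2) != 0;
    const bool C = (Bits & 4) != 0;
    if (First(A, B, C) != Second(A, B, C))
      ++Count;
  }
  return Count;
}
```

By De Morgan's law the committed and restated forms agree on every input, whereas the proposed form is strictly narrower: it is true only when operand 1 is not a 16-bit immediate and both operands have a single use.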

		/// SelectOptimalAddrMode - Based on a node N and its Parent (a MemSDNode),
		/// compute the address flags of the node, get the optimal address mode based
		/// on the flags, and set the Base and Disp based on the address mode.
		PPC::AddrMode PPCTargetLowering::SelectOptimalAddrMode(const SDNode *Parent,
		SDValue N, SDValue &Disp,
		SDValue &Base,
		SelectionDAG &DAG,
		MaybeAlign Align) const {
		SDLoc DL(Parent);

		// Compute the address flags.
		unsigned Flags = computeMOFlags(Parent, N, DAG);

		// Get the optimal address mode based on the Flags.
		PPC::AddrMode Mode = getAddrModeForFlags(Flags);

		// Set Base and Disp accordingly depending on the address mode.
		switch (Mode) {
		case PPC::AM_DForm:
		[Inline comment] nemanjai (Done): Why can't this be `(CNType == MVT::i32 || isInt<32>(CNImm))`
		case PPC::AM_DSForm:
		case PPC::AM_DQForm: {
		[Inline comment] nemanjai (Done): For purposes where the width of the value matters, please refrain from using types with an implicit size and use explicitly sized types (i.e. `int32_t` vs. `int`, `int16_t` vs. `short`).
		// This is a register plus a 16-bit immediate. The base will be the
		[Inline comment] nemanjai (Not Done): // This is a register plus a 16-bit immediate. The base will be the // register and the displacement will be the immediate unless it // isn't sufficiently aligned.
		// register and the displacement will be the immediate unless it
		// isn't sufficiently aligned.
		if (Flags & PPC::MOF_RPlusSImm16) {
		SDValue Op0 = N.getOperand(0);
		SDValue Op1 = N.getOperand(1);
		ConstantSDNode *CN = dyn_cast<ConstantSDNode>(Op1);
		int16_t Imm = CN->getAPIntValue().getZExtValue();
		if (!Align || isAligned(*Align, Imm)) {
		Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());
		Base = Op0;
		if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) {
		Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());
		[Inline comment] nemanjai (Not Done): Disp = DAG.getTargetConstant(Imm, DL, N.getValueType()); Base = Op0; if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) { Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType()); fixupFuncForFI(DAG, FI->getIndex(), N.getValueType()); }
		fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());
		}
		break;
		}
		[Inline comment] nemanjai (Not Done): This is not necessarily a load I think. // This is a register plus the @lo relocation. The base is the register // and the displacement is the global address.
		}
		// This is a register plus the @lo relocation. The base is the register
		// and the displacement is the global address.
		else if (Flags & PPC::MOF_RPlusLo) {
		Disp = N.getOperand(1).getOperand(0); // The global address.
		assert(Disp.getOpcode() == ISD::TargetGlobalAddress ||
		Disp.getOpcode() == ISD::TargetGlobalTLSAddress ||
		Disp.getOpcode() == ISD::TargetConstantPool ||
		Disp.getOpcode() == ISD::TargetJumpTable);
		Base = N.getOperand(0);
		[Inline comment] nemanjai (Not Done): // This is a constant address at most 32 bits. The base will be // zero or load-immediate-shifted and the displacement will be // the low 16 bits of the address.
		break;
		}
		// This is a constant address at most 32 bits. The base will be
		// zero or load-immediate-shifted and the displacement will be
		// the low 16 bits of the address.
		else if (Flags & PPC::MOF_AddrIsSImm32) {
		ConstantSDNode *CN = dyn_cast<ConstantSDNode>(N);
		EVT CNType = CN->getValueType(0);
		uint64_t CNImm = CN->getZExtValue();
		// If this address fits entirely in a 16-bit sext immediate field, codegen
		// this as "d, 0".
		int16_t Imm;
		if (isIntS16Immediate(CN, Imm) && (!Align || isAligned(*Align, Imm))) {
		Disp = DAG.getTargetConstant(Imm, DL, CNType);
		Base = DAG.getRegister(Subtarget.isPPC64() ? PPC::ZERO8 : PPC::ZERO,
		CNType);
		break;
		}
		// Handle 32-bit sext immediate with LIS + Addr mode.
		if ((CNType == MVT::i32 || isInt<32>(CNImm)) &&
		(!Align || isAligned(*Align, CNImm))) {
		int32_t Addr = (int32_t)CNImm;
		// Otherwise, break this down into LIS + Disp.
		Disp = DAG.getTargetConstant((int16_t)Addr, DL, MVT::i32);
		Base =
		DAG.getTargetConstant((Addr - (int16_t)Addr) >> 16, DL, MVT::i32);
		uint32_t LIS = CNType == MVT::i32 ? PPC::LIS : PPC::LIS8;
		Base = SDValue(DAG.getMachineNode(LIS, DL, CNType, Base), 0);
		break;
		}
		}
		// Otherwise, the PPC::MOF_NotAddNorCst flag is set. The load/store is
		// non-foldable.
		Disp = DAG.getTargetConstant(0, DL, getPointerTy(DAG.getDataLayout()));
		if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(N)) {
		Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());
		fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());
		} else
		Base = N;
		break;
		}
		case PPC::AM_None:
		break;
		default: { // By default, X-Form is always available to be selected.
		// When a frame index is not aligned, we also match by XForm.
		FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(N);
		Base = FI ? N : N.getOperand(1);
		Disp = FI ? DAG.getRegister(Subtarget.isPPC64() ? PPC::ZERO8 : PPC::ZERO,
		N.getValueType())
		: N.getOperand(0);
		break;
		}
		}
		return Mode;
		}
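
The `MOF_AddrIsSImm32` branch of `SelectOptimalAddrMode` splits a 32-bit constant address into a value for `LIS` plus a 16-bit displacement. Because the displacement is sign-extended, the high part has to be computed from `Addr - Lo` rather than from `Addr` alone. The following standalone sketch shows that arithmetic. The names `LisDisp`, `splitAddress`, and `recombine` are hypothetical, and the sketch uses 64-bit intermediates and division instead of the shift in the patch so that the arithmetic stays well defined for negative values.

```cpp
#include <cstdint>

struct LisDisp {
  int32_t Hi; // Value materialized by LIS (shifted left by 16 at run time).
  int16_t Lo; // Sign-extended 16-bit displacement.
};

// Splits Addr so that Hi * 65536 + Lo == Addr.
inline LisDisp splitAddress(int32_t Addr) {
  // Take the low 16 bits and reinterpret them as a signed 16-bit value.
  int32_t Lo = Addr & 0xFFFF;
  if (Lo >= 0x8000)
    Lo -= 0x10000;
  // Addr - Lo is an exact multiple of 65536, so the division is exact.
  const int64_t Rest = static_cast<int64_t>(Addr) - Lo;
  return {static_cast<int32_t>(Rest / 65536), static_cast<int16_t>(Lo)};
}

// Rebuilds the address the way the hardware would: (Hi << 16) + sext(Lo).
inline int64_t recombine(LisDisp Parts) {
  return static_cast<int64_t>(Parts.Hi) * 65536 + Parts.Lo;
}
```

For example, `0x12348000` has low half `0x8000`, which sign-extends to -32768, so the high part becomes `0x1235` rather than `0x1234`.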

llvm/lib/Target/PowerPC/PPCInstr64Bit.td

[... 1,056 lines not shown ...]
//		//


// Sign extending loads.		// Sign extending loads.
let PPC970_Unit = 2 in {		let PPC970_Unit = 2 in {
let Interpretation64Bit = 1, isCodeGenOnly = 1 in		let Interpretation64Bit = 1, isCodeGenOnly = 1 in
def LHA8: DForm_1<42, (outs g8rc:$rD), (ins memri:$src),		def LHA8: DForm_1<42, (outs g8rc:$rD), (ins memri:$src),
"lha $rD, $src", IIC_LdStLHA,		"lha $rD, $src", IIC_LdStLHA,
[(set i64:$rD, (sextloadi16 iaddr:$src))]>,		[(set i64:$rD, (sextloadi16 DForm:$src))]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def LWA : DSForm_1<58, 2, (outs g8rc:$rD), (ins memrix:$src),		def LWA : DSForm_1<58, 2, (outs g8rc:$rD), (ins memrix:$src),
"lwa $rD, $src", IIC_LdStLWA,		"lwa $rD, $src", IIC_LdStLWA,
[(set i64:$rD,		[(set i64:$rD,
(DSFormSextLoadi32 iaddrX4:$src))]>, isPPC64,		(sextloadi32 DSForm:$src))]>, isPPC64,
		[Inline comment] steven.zhang (Not Done): Not sure if this is handled correctly, as you are removing the alignment restriction for LWA. Technically speaking, we need to remove all such alignment restrictions in the loads/stores, as far as our analysis of the source code goes. But I notice that we are removing the alignment restriction here while still keeping it for LD.
		[Inline comment] amyk (Author, Done): You're right; I meant to remove the `align`/`unalign` in the other `LD`, `LWA` patterns as well, as the new load/store infrastructure is meant to use load/sextload/zextload and compute alignment based on the flags.
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
let Interpretation64Bit = 1, isCodeGenOnly = 1 in		let Interpretation64Bit = 1, isCodeGenOnly = 1 in
def LHAX8: XForm_1_memOp<31, 343, (outs g8rc:$rD), (ins memrr:$src),		def LHAX8: XForm_1_memOp<31, 343, (outs g8rc:$rD), (ins memrr:$src),
"lhax $rD, $src", IIC_LdStLHA,		"lhax $rD, $src", IIC_LdStLHA,
[(set i64:$rD, (sextloadi16 xaddr:$src))]>,		[(set i64:$rD, (sextloadi16 XForm:$src))]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def LWAX : XForm_1_memOp<31, 341, (outs g8rc:$rD), (ins memrr:$src),		def LWAX : XForm_1_memOp<31, 341, (outs g8rc:$rD), (ins memrr:$src),
"lwax $rD, $src", IIC_LdStLHA,		"lwax $rD, $src", IIC_LdStLHA,
[(set i64:$rD, (sextloadi32 xaddrX4:$src))]>, isPPC64,		[(set i64:$rD, (sextloadi32 XForm:$src))]>, isPPC64,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
// For fast-isel:		// For fast-isel:
let isCodeGenOnly = 1, mayLoad = 1, hasSideEffects = 0 in {		let isCodeGenOnly = 1, mayLoad = 1, hasSideEffects = 0 in {
def LWA_32 : DSForm_1<58, 2, (outs gprc:$rD), (ins memrix:$src),		def LWA_32 : DSForm_1<58, 2, (outs gprc:$rD), (ins memrix:$src),
"lwa $rD, $src", IIC_LdStLWA, []>, isPPC64,		"lwa $rD, $src", IIC_LdStLWA, []>, isPPC64,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def LWAX_32 : XForm_1_memOp<31, 341, (outs gprc:$rD), (ins memrr:$src),		def LWAX_32 : XForm_1_memOp<31, 341, (outs gprc:$rD), (ins memrr:$src),
"lwax $rD, $src", IIC_LdStLHA, []>, isPPC64,		"lwax $rD, $src", IIC_LdStLHA, []>, isPPC64,
[... 24 lines not shown ...]
}		}
}		}

let Interpretation64Bit = 1, isCodeGenOnly = 1 in {		let Interpretation64Bit = 1, isCodeGenOnly = 1 in {
// Zero extending loads.		// Zero extending loads.
let PPC970_Unit = 2 in {		let PPC970_Unit = 2 in {
def LBZ8 : DForm_1<34, (outs g8rc:$rD), (ins memri:$src),		def LBZ8 : DForm_1<34, (outs g8rc:$rD), (ins memri:$src),
"lbz $rD, $src", IIC_LdStLoad,		"lbz $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (zextloadi8 iaddr:$src))]>;		[(set i64:$rD, (zextloadi8 DForm:$src))]>;
def LHZ8 : DForm_1<40, (outs g8rc:$rD), (ins memri:$src),		def LHZ8 : DForm_1<40, (outs g8rc:$rD), (ins memri:$src),
"lhz $rD, $src", IIC_LdStLoad,		"lhz $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (zextloadi16 iaddr:$src))]>;		[(set i64:$rD, (zextloadi16 DForm:$src))]>;
def LWZ8 : DForm_1<32, (outs g8rc:$rD), (ins memri:$src),		def LWZ8 : DForm_1<32, (outs g8rc:$rD), (ins memri:$src),
"lwz $rD, $src", IIC_LdStLoad,		"lwz $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (zextloadi32 iaddr:$src))]>, isPPC64;		[(set i64:$rD, (zextloadi32 DForm:$src))]>, isPPC64;

def LBZX8 : XForm_1_memOp<31, 87, (outs g8rc:$rD), (ins memrr:$src),		def LBZX8 : XForm_1_memOp<31, 87, (outs g8rc:$rD), (ins memrr:$src),
"lbzx $rD, $src", IIC_LdStLoad,		"lbzx $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (zextloadi8 xaddr:$src))]>;		[(set i64:$rD, (zextloadi8 XForm:$src))]>;
def LHZX8 : XForm_1_memOp<31, 279, (outs g8rc:$rD), (ins memrr:$src),		def LHZX8 : XForm_1_memOp<31, 279, (outs g8rc:$rD), (ins memrr:$src),
"lhzx $rD, $src", IIC_LdStLoad,		"lhzx $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (zextloadi16 xaddr:$src))]>;		[(set i64:$rD, (zextloadi16 XForm:$src))]>;
def LWZX8 : XForm_1_memOp<31, 23, (outs g8rc:$rD), (ins memrr:$src),		def LWZX8 : XForm_1_memOp<31, 23, (outs g8rc:$rD), (ins memrr:$src),
"lwzx $rD, $src", IIC_LdStLoad,		"lwzx $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (zextloadi32 xaddr:$src))]>;		[(set i64:$rD, (zextloadi32 XForm:$src))]>;


// Update forms.		// Update forms.
let mayLoad = 1, hasSideEffects = 0 in {		let mayLoad = 1, hasSideEffects = 0 in {
def LBZU8 : DForm_1<35, (outs g8rc:$rD, ptr_rc_nor0:$ea_result),		def LBZU8 : DForm_1<35, (outs g8rc:$rD, ptr_rc_nor0:$ea_result),
(ins memri:$addr),		(ins memri:$addr),
"lbzu $rD, $addr", IIC_LdStLoadUpd,		"lbzu $rD, $addr", IIC_LdStLoadUpd,
[]>, RegConstraint<"$addr.reg = $ea_result">,		[]>, RegConstraint<"$addr.reg = $ea_result">,
[... 28 lines not shown ...]
}		}
} // Interpretation64Bit		} // Interpretation64Bit


// Full 8-byte loads.		// Full 8-byte loads.
let PPC970_Unit = 2 in {		let PPC970_Unit = 2 in {
def LD : DSForm_1<58, 0, (outs g8rc:$rD), (ins memrix:$src),		def LD : DSForm_1<58, 0, (outs g8rc:$rD), (ins memrix:$src),
"ld $rD, $src", IIC_LdStLD,		"ld $rD, $src", IIC_LdStLD,
[(set i64:$rD, (DSFormLoad iaddrX4:$src))]>, isPPC64;		[(set i64:$rD, (load DSForm:$src))]>, isPPC64;
// The following four definitions are selected for small code model only.		// The following four definitions are selected for small code model only.
// Otherwise, we need to create two instructions to form a 32-bit offset,		// Otherwise, we need to create two instructions to form a 32-bit offset,
// so we have a custom matcher for TOC_ENTRY in PPCDAGToDAGIsel::Select().		// so we have a custom matcher for TOC_ENTRY in PPCDAGToDAGIsel::Select().
def LDtoc: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),		def LDtoc: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),
"#LDtoc",		"#LDtoc",
[(set i64:$rD,		[(set i64:$rD,
(PPCtoc_entry tglobaladdr:$disp, i64:$reg))]>, isPPC64;		(PPCtoc_entry tglobaladdr:$disp, i64:$reg))]>, isPPC64;
def LDtocJTI: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),		def LDtocJTI: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),
"#LDtocJTI",		"#LDtocJTI",
[(set i64:$rD,		[(set i64:$rD,
(PPCtoc_entry tjumptable:$disp, i64:$reg))]>, isPPC64;		(PPCtoc_entry tjumptable:$disp, i64:$reg))]>, isPPC64;
def LDtocCPT: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),		def LDtocCPT: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),
"#LDtocCPT",		"#LDtocCPT",
[(set i64:$rD,		[(set i64:$rD,
(PPCtoc_entry tconstpool:$disp, i64:$reg))]>, isPPC64;		(PPCtoc_entry tconstpool:$disp, i64:$reg))]>, isPPC64;
def LDtocBA: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),		def LDtocBA: PPCEmitTimePseudo<(outs g8rc:$rD), (ins tocentry:$disp, g8rc:$reg),
"#LDtocCPT",		"#LDtocCPT",
[(set i64:$rD,		[(set i64:$rD,
(PPCtoc_entry tblockaddress:$disp, i64:$reg))]>, isPPC64;		(PPCtoc_entry tblockaddress:$disp, i64:$reg))]>, isPPC64;

def LDX : XForm_1_memOp<31, 21, (outs g8rc:$rD), (ins memrr:$src),		def LDX : XForm_1_memOp<31, 21, (outs g8rc:$rD), (ins memrr:$src),
"ldx $rD, $src", IIC_LdStLD,		"ldx $rD, $src", IIC_LdStLD,
[(set i64:$rD, (load xaddrX4:$src))]>, isPPC64;		[(set i64:$rD, (load XForm:$src))]>, isPPC64;
def LDBRX : XForm_1_memOp<31, 532, (outs g8rc:$rD), (ins memrr:$src),		def LDBRX : XForm_1_memOp<31, 532, (outs g8rc:$rD), (ins memrr:$src),
"ldbrx $rD, $src", IIC_LdStLoad,		"ldbrx $rD, $src", IIC_LdStLoad,
[(set i64:$rD, (PPClbrx xoaddr:$src, i64))]>, isPPC64;		[(set i64:$rD, (PPClbrx ForceXForm:$src, i64))]>, isPPC64;

let mayLoad = 1, hasSideEffects = 0, isCodeGenOnly = 1 in {		let mayLoad = 1, hasSideEffects = 0, isCodeGenOnly = 1 in {
def LHBRX8 : XForm_1_memOp<31, 790, (outs g8rc:$rD), (ins memrr:$src),		def LHBRX8 : XForm_1_memOp<31, 790, (outs g8rc:$rD), (ins memrr:$src),
"lhbrx $rD, $src", IIC_LdStLoad, []>;		"lhbrx $rD, $src", IIC_LdStLoad, []>;
def LWBRX8 : XForm_1_memOp<31, 534, (outs g8rc:$rD), (ins memrr:$src),		def LWBRX8 : XForm_1_memOp<31, 534, (outs g8rc:$rD), (ins memrr:$src),
"lwbrx $rD, $src", IIC_LdStLoad, []>;		"lwbrx $rD, $src", IIC_LdStLoad, []>;
}		}

[... 159 lines not shown ...]	def PADDIdtprel : PPCEmitTimePseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
(PPCpaddiDtprel i64:$reg, tglobaltlsaddr:$disp))]>,		(PPCpaddiDtprel i64:$reg, tglobaltlsaddr:$disp))]>,
isPPC64;		isPPC64;

let PPC970_Unit = 2 in {		let PPC970_Unit = 2 in {
let Interpretation64Bit = 1, isCodeGenOnly = 1 in {		let Interpretation64Bit = 1, isCodeGenOnly = 1 in {
// Truncating stores.		// Truncating stores.
def STB8 : DForm_1<38, (outs), (ins g8rc:$rS, memri:$src),		def STB8 : DForm_1<38, (outs), (ins g8rc:$rS, memri:$src),
"stb $rS, $src", IIC_LdStStore,		"stb $rS, $src", IIC_LdStStore,
[(truncstorei8 i64:$rS, iaddr:$src)]>;		[(truncstorei8 i64:$rS, DForm:$src)]>;
def STH8 : DForm_1<44, (outs), (ins g8rc:$rS, memri:$src),		def STH8 : DForm_1<44, (outs), (ins g8rc:$rS, memri:$src),
"sth $rS, $src", IIC_LdStStore,		"sth $rS, $src", IIC_LdStStore,
[(truncstorei16 i64:$rS, iaddr:$src)]>;		[(truncstorei16 i64:$rS, DForm:$src)]>;
def STW8 : DForm_1<36, (outs), (ins g8rc:$rS, memri:$src),		def STW8 : DForm_1<36, (outs), (ins g8rc:$rS, memri:$src),
"stw $rS, $src", IIC_LdStStore,		"stw $rS, $src", IIC_LdStStore,
[(truncstorei32 i64:$rS, iaddr:$src)]>;		[(truncstorei32 i64:$rS, DForm:$src)]>;
def STBX8 : XForm_8_memOp<31, 215, (outs), (ins g8rc:$rS, memrr:$dst),		def STBX8 : XForm_8_memOp<31, 215, (outs), (ins g8rc:$rS, memrr:$dst),
"stbx $rS, $dst", IIC_LdStStore,		"stbx $rS, $dst", IIC_LdStStore,
[(truncstorei8 i64:$rS, xaddr:$dst)]>,		[(truncstorei8 i64:$rS, XForm:$dst)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def STHX8 : XForm_8_memOp<31, 407, (outs), (ins g8rc:$rS, memrr:$dst),		def STHX8 : XForm_8_memOp<31, 407, (outs), (ins g8rc:$rS, memrr:$dst),
"sthx $rS, $dst", IIC_LdStStore,		"sthx $rS, $dst", IIC_LdStStore,
[(truncstorei16 i64:$rS, xaddr:$dst)]>,		[(truncstorei16 i64:$rS, XForm:$dst)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def STWX8 : XForm_8_memOp<31, 151, (outs), (ins g8rc:$rS, memrr:$dst),		def STWX8 : XForm_8_memOp<31, 151, (outs), (ins g8rc:$rS, memrr:$dst),
"stwx $rS, $dst", IIC_LdStStore,		"stwx $rS, $dst", IIC_LdStStore,
[(truncstorei32 i64:$rS, xaddr:$dst)]>,		[(truncstorei32 i64:$rS, XForm:$dst)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
} // Interpretation64Bit		} // Interpretation64Bit

// Normal 8-byte stores.		// Normal 8-byte stores.
def STD : DSForm_1<62, 0, (outs), (ins g8rc:$rS, memrix:$dst),		def STD : DSForm_1<62, 0, (outs), (ins g8rc:$rS, memrix:$dst),
"std $rS, $dst", IIC_LdStSTD,		"std $rS, $dst", IIC_LdStSTD,
[(DSFormStore i64:$rS, iaddrX4:$dst)]>, isPPC64;		[(store i64:$rS, DSForm:$dst)]>, isPPC64;
def STDX : XForm_8_memOp<31, 149, (outs), (ins g8rc:$rS, memrr:$dst),		def STDX : XForm_8_memOp<31, 149, (outs), (ins g8rc:$rS, memrr:$dst),
"stdx $rS, $dst", IIC_LdStSTD,		"stdx $rS, $dst", IIC_LdStSTD,
[(store i64:$rS, xaddrX4:$dst)]>, isPPC64,		[(store i64:$rS, XForm:$dst)]>, isPPC64,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def STDBRX: XForm_8_memOp<31, 660, (outs), (ins g8rc:$rS, memrr:$dst),		def STDBRX: XForm_8_memOp<31, 660, (outs), (ins g8rc:$rS, memrr:$dst),
"stdbrx $rS, $dst", IIC_LdStStore,		"stdbrx $rS, $dst", IIC_LdStStore,
[(PPCstbrx i64:$rS, xoaddr:$dst, i64)]>, isPPC64,		[(PPCstbrx i64:$rS, ForceXForm:$dst, i64)]>, isPPC64,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
}		}

// Stores with Update (pre-inc).		// Stores with Update (pre-inc).
let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {		let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {
let Interpretation64Bit = 1, isCodeGenOnly = 1 in {		let Interpretation64Bit = 1, isCodeGenOnly = 1 in {
def STBU8 : DForm_1<39, (outs ptr_rc_nor0:$ea_res), (ins g8rc:$rS, memri:$dst),		def STBU8 : DForm_1<39, (outs ptr_rc_nor0:$ea_res), (ins g8rc:$rS, memri:$dst),
"stbu $rS, $dst", IIC_LdStSTU, []>,		"stbu $rS, $dst", IIC_LdStSTU, []>,
[... 139 lines not shown ...]
// (we could use the default xori pattern, but nor has lower latency on some		// (we could use the default xori pattern, but nor has lower latency on some
// cores (such as the A2)).		// cores (such as the A2)).
def i64not : OutPatFrag<(ops node:$in),		def i64not : OutPatFrag<(ops node:$in),
(NOR8 $in, $in)>;		(NOR8 $in, $in)>;
def : Pat<(not i64:$in),		def : Pat<(not i64:$in),
(i64not $in)>;		(i64not $in)>;

// Extending loads with i64 targets.		// Extending loads with i64 targets.
def : Pat<(zextloadi1 iaddr:$src),		def : Pat<(zextloadi1 DForm:$src),
(LBZ8 iaddr:$src)>;		(LBZ8 DForm:$src)>;
def : Pat<(zextloadi1 xaddr:$src),		def : Pat<(zextloadi1 XForm:$src),
(LBZX8 xaddr:$src)>;		(LBZX8 XForm:$src)>;
def : Pat<(extloadi1 iaddr:$src),		def : Pat<(extloadi1 DForm:$src),
(LBZ8 iaddr:$src)>;		(LBZ8 DForm:$src)>;
def : Pat<(extloadi1 xaddr:$src),		def : Pat<(extloadi1 XForm:$src),
(LBZX8 xaddr:$src)>;		(LBZX8 XForm:$src)>;
def : Pat<(extloadi8 iaddr:$src),		def : Pat<(extloadi8 DForm:$src),
(LBZ8 iaddr:$src)>;		(LBZ8 DForm:$src)>;
def : Pat<(extloadi8 xaddr:$src),		def : Pat<(extloadi8 XForm:$src),
(LBZX8 xaddr:$src)>;		(LBZX8 XForm:$src)>;
def : Pat<(extloadi16 iaddr:$src),		def : Pat<(extloadi16 DForm:$src),
(LHZ8 iaddr:$src)>;		(LHZ8 DForm:$src)>;
def : Pat<(extloadi16 xaddr:$src),		def : Pat<(extloadi16 XForm:$src),
(LHZX8 xaddr:$src)>;		(LHZX8 XForm:$src)>;
def : Pat<(extloadi32 iaddr:$src),		def : Pat<(extloadi32 DForm:$src),
(LWZ8 iaddr:$src)>;		(LWZ8 DForm:$src)>;
def : Pat<(extloadi32 xaddr:$src),		def : Pat<(extloadi32 XForm:$src),
(LWZX8 xaddr:$src)>;		(LWZX8 XForm:$src)>;

// Standard shifts. These are represented separately from the real shifts above		// Standard shifts. These are represented separately from the real shifts above
// so that we can distinguish between shifts that allow 6-bit and 7-bit shift		// so that we can distinguish between shifts that allow 6-bit and 7-bit shift
// amounts.		// amounts.
def : Pat<(sra i64:$rS, i32:$rB),		def : Pat<(sra i64:$rS, i32:$rB),
(SRAD $rS, $rB)>;		(SRAD $rS, $rB)>;
def : Pat<(srl i64:$rS, i32:$rB),		def : Pat<(srl i64:$rS, i32:$rB),
(SRD $rS, $rB)>;		(SRD $rS, $rB)>;
[... 37 lines not shown ...]	def : Pat<(add i64:$in, (PPChi tjumptable:$g, 0)),
(ADDIS8 $in, tjumptable:$g)>;		(ADDIS8 $in, tjumptable:$g)>;
def : Pat<(add i64:$in, (PPChi tblockaddress:$g, 0)),		def : Pat<(add i64:$in, (PPChi tblockaddress:$g, 0)),
(ADDIS8 $in, tblockaddress:$g)>;		(ADDIS8 $in, tblockaddress:$g)>;

// AIX 64-bit small code model TLS access.		// AIX 64-bit small code model TLS access.
def : Pat<(i64 (PPCtoc_entry tglobaltlsaddr:$disp, i64:$reg)),		def : Pat<(i64 (PPCtoc_entry tglobaltlsaddr:$disp, i64:$reg)),
(i64 (LDtoc tglobaltlsaddr:$disp, i64:$reg))>;		(i64 (LDtoc tglobaltlsaddr:$disp, i64:$reg))>;

// Patterns to match r+r indexed loads and stores for
// addresses without at least 4-byte alignment.
def : Pat<(i64 (NonDSFormSextLoadi32 xoaddr:$src)),
(LWAX xoaddr:$src)>;
def : Pat<(i64 (NonDSFormLoad xoaddr:$src)),
(LDX xoaddr:$src)>;
def : Pat<(NonDSFormStore i64:$rS, xoaddr:$dst),
(STDX $rS, xoaddr:$dst)>;

// 64-bits atomic loads and stores		// 64-bits atomic loads and stores
def : Pat<(atomic_load_64 iaddrX4:$src), (LD memrix:$src)>;		def : Pat<(atomic_load_64 iaddrX4:$src), (LD memrix:$src)>;
def : Pat<(atomic_load_64 xaddrX4:$src), (LDX memrr:$src)>;		def : Pat<(atomic_load_64 xaddrX4:$src), (LDX memrr:$src)>;

def : Pat<(atomic_store_64 iaddrX4:$ptr, i64:$val), (STD g8rc:$val, memrix:$ptr)>;		def : Pat<(atomic_store_64 iaddrX4:$ptr, i64:$val), (STD g8rc:$val, memrix:$ptr)>;
def : Pat<(atomic_store_64 xaddrX4:$ptr, i64:$val), (STDX g8rc:$val, memrr:$ptr)>;		def : Pat<(atomic_store_64 xaddrX4:$ptr, i64:$val), (STDX g8rc:$val, memrr:$ptr)>;

let Predicates = [IsISA3_0] in {		let Predicates = [IsISA3_0] in {
[... 30 lines not shown ...]

llvm/lib/Target/PowerPC/PPCInstrAltivec.td

[... 405 lines not shown ...]	let hasSideEffects = 1 in {
def MTVSCR : VXForm_5<1604, (outs), (ins vrrc:$vB),		def MTVSCR : VXForm_5<1604, (outs), (ins vrrc:$vB),
"mtvscr $vB", IIC_LdStLoad,		"mtvscr $vB", IIC_LdStLoad,
[(int_ppc_altivec_mtvscr v4i32:$vB)]>;		[(int_ppc_altivec_mtvscr v4i32:$vB)]>;
}		}

let PPC970_Unit = 2, mayLoad = 1, mayStore = 0 in { // Loads.		let PPC970_Unit = 2, mayLoad = 1, mayStore = 0 in { // Loads.
def LVEBX: XForm_1_memOp<31, 7, (outs vrrc:$vD), (ins memrr:$src),		def LVEBX: XForm_1_memOp<31, 7, (outs vrrc:$vD), (ins memrr:$src),
"lvebx $vD, $src", IIC_LdStLoad,		"lvebx $vD, $src", IIC_LdStLoad,
[(set v16i8:$vD, (int_ppc_altivec_lvebx xoaddr:$src))]>;		[(set v16i8:$vD, (int_ppc_altivec_lvebx ForceXForm:$src))]>;
def LVEHX: XForm_1_memOp<31, 39, (outs vrrc:$vD), (ins memrr:$src),		def LVEHX: XForm_1_memOp<31, 39, (outs vrrc:$vD), (ins memrr:$src),
"lvehx $vD, $src", IIC_LdStLoad,		"lvehx $vD, $src", IIC_LdStLoad,
[(set v8i16:$vD, (int_ppc_altivec_lvehx xoaddr:$src))]>;		[(set v8i16:$vD, (int_ppc_altivec_lvehx ForceXForm:$src))]>;
def LVEWX: XForm_1_memOp<31, 71, (outs vrrc:$vD), (ins memrr:$src),		def LVEWX: XForm_1_memOp<31, 71, (outs vrrc:$vD), (ins memrr:$src),
"lvewx $vD, $src", IIC_LdStLoad,		"lvewx $vD, $src", IIC_LdStLoad,
[(set v4i32:$vD, (int_ppc_altivec_lvewx xoaddr:$src))]>;		[(set v4i32:$vD, (int_ppc_altivec_lvewx ForceXForm:$src))]>;
def LVX : XForm_1_memOp<31, 103, (outs vrrc:$vD), (ins memrr:$src),		def LVX : XForm_1_memOp<31, 103, (outs vrrc:$vD), (ins memrr:$src),
"lvx $vD, $src", IIC_LdStLoad,		"lvx $vD, $src", IIC_LdStLoad,
[(set v4i32:$vD, (int_ppc_altivec_lvx xoaddr:$src))]>;		[(set v4i32:$vD, (int_ppc_altivec_lvx ForceXForm:$src))]>;
def LVXL : XForm_1_memOp<31, 359, (outs vrrc:$vD), (ins memrr:$src),		def LVXL : XForm_1_memOp<31, 359, (outs vrrc:$vD), (ins memrr:$src),
"lvxl $vD, $src", IIC_LdStLoad,		"lvxl $vD, $src", IIC_LdStLoad,
[(set v4i32:$vD, (int_ppc_altivec_lvxl xoaddr:$src))]>;		[(set v4i32:$vD, (int_ppc_altivec_lvxl ForceXForm:$src))]>;
}		}

def LVSL : XForm_1_memOp<31, 6, (outs vrrc:$vD), (ins memrr:$src),		def LVSL : XForm_1_memOp<31, 6, (outs vrrc:$vD), (ins memrr:$src),
"lvsl $vD, $src", IIC_LdStLoad,		"lvsl $vD, $src", IIC_LdStLoad,
[(set v16i8:$vD, (int_ppc_altivec_lvsl xoaddr:$src))]>,		[(set v16i8:$vD, (int_ppc_altivec_lvsl ForceXForm:$src))]>,
PPC970_Unit_LSU;		PPC970_Unit_LSU;
def LVSR : XForm_1_memOp<31, 38, (outs vrrc:$vD), (ins memrr:$src),		def LVSR : XForm_1_memOp<31, 38, (outs vrrc:$vD), (ins memrr:$src),
"lvsr $vD, $src", IIC_LdStLoad,		"lvsr $vD, $src", IIC_LdStLoad,
[(set v16i8:$vD, (int_ppc_altivec_lvsr xoaddr:$src))]>,		[(set v16i8:$vD, (int_ppc_altivec_lvsr ForceXForm:$src))]>,
PPC970_Unit_LSU;		PPC970_Unit_LSU;

let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in { // Stores.		let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in { // Stores.
def STVEBX: XForm_8_memOp<31, 135, (outs), (ins vrrc:$rS, memrr:$dst),		def STVEBX: XForm_8_memOp<31, 135, (outs), (ins vrrc:$rS, memrr:$dst),
"stvebx $rS, $dst", IIC_LdStStore,		"stvebx $rS, $dst", IIC_LdStStore,
[(int_ppc_altivec_stvebx v16i8:$rS, xoaddr:$dst)]>;		[(int_ppc_altivec_stvebx v16i8:$rS, ForceXForm:$dst)]>;
def STVEHX: XForm_8_memOp<31, 167, (outs), (ins vrrc:$rS, memrr:$dst),		def STVEHX: XForm_8_memOp<31, 167, (outs), (ins vrrc:$rS, memrr:$dst),
"stvehx $rS, $dst", IIC_LdStStore,		"stvehx $rS, $dst", IIC_LdStStore,
[(int_ppc_altivec_stvehx v8i16:$rS, xoaddr:$dst)]>;		[(int_ppc_altivec_stvehx v8i16:$rS, ForceXForm:$dst)]>;
def STVEWX: XForm_8_memOp<31, 199, (outs), (ins vrrc:$rS, memrr:$dst),		def STVEWX: XForm_8_memOp<31, 199, (outs), (ins vrrc:$rS, memrr:$dst),
"stvewx $rS, $dst", IIC_LdStStore,		"stvewx $rS, $dst", IIC_LdStStore,
[(int_ppc_altivec_stvewx v4i32:$rS, xoaddr:$dst)]>;		[(int_ppc_altivec_stvewx v4i32:$rS, ForceXForm:$dst)]>;
def STVX : XForm_8_memOp<31, 231, (outs), (ins vrrc:$rS, memrr:$dst),		def STVX : XForm_8_memOp<31, 231, (outs), (ins vrrc:$rS, memrr:$dst),
"stvx $rS, $dst", IIC_LdStStore,		"stvx $rS, $dst", IIC_LdStStore,
[(int_ppc_altivec_stvx v4i32:$rS, xoaddr:$dst)]>;		[(int_ppc_altivec_stvx v4i32:$rS, ForceXForm:$dst)]>;
def STVXL : XForm_8_memOp<31, 487, (outs), (ins vrrc:$rS, memrr:$dst),		def STVXL : XForm_8_memOp<31, 487, (outs), (ins vrrc:$rS, memrr:$dst),
"stvxl $rS, $dst", IIC_LdStStore,		"stvxl $rS, $dst", IIC_LdStStore,
[(int_ppc_altivec_stvxl v4i32:$rS, xoaddr:$dst)]>;		[(int_ppc_altivec_stvxl v4i32:$rS, ForceXForm:$dst)]>;
}		}

let PPC970_Unit = 5 in { // VALU Operations.		let PPC970_Unit = 5 in { // VALU Operations.
// VA-Form instructions. 3-input AltiVec ops.		// VA-Form instructions. 3-input AltiVec ops.
let isCommutable = 1 in {		let isCommutable = 1 in {
def VMADDFP : VAForm_1<46, (outs vrrc:$vD), (ins vrrc:$vA, vrrc:$vC, vrrc:$vB),		def VMADDFP : VAForm_1<46, (outs vrrc:$vD), (ins vrrc:$vA, vrrc:$vC, vrrc:$vB),
"vmaddfp $vD, $vA, $vC, $vB", IIC_VecFP,		"vmaddfp $vD, $vA, $vC, $vB", IIC_VecFP,
[(set v4f32:$vD,		[(set v4f32:$vD,
[... 427 lines not shown ...]
def : Pat<(v16i8 (ssubsat v16i8:$vA, v16i8:$vB)), (v16i8 (VSUBSBS $vA, $vB))>;		def : Pat<(v16i8 (ssubsat v16i8:$vA, v16i8:$vB)), (v16i8 (VSUBSBS $vA, $vB))>;
def : Pat<(v16i8 (usubsat v16i8:$vA, v16i8:$vB)), (v16i8 (VSUBUBS $vA, $vB))>;		def : Pat<(v16i8 (usubsat v16i8:$vA, v16i8:$vB)), (v16i8 (VSUBUBS $vA, $vB))>;
def : Pat<(v8i16 (ssubsat v8i16:$vA, v8i16:$vB)), (v8i16 (VSUBSHS $vA, $vB))>;		def : Pat<(v8i16 (ssubsat v8i16:$vA, v8i16:$vB)), (v8i16 (VSUBSHS $vA, $vB))>;
def : Pat<(v8i16 (usubsat v8i16:$vA, v8i16:$vB)), (v8i16 (VSUBUHS $vA, $vB))>;		def : Pat<(v8i16 (usubsat v8i16:$vA, v8i16:$vB)), (v8i16 (VSUBUHS $vA, $vB))>;
def : Pat<(v4i32 (ssubsat v4i32:$vA, v4i32:$vB)), (v4i32 (VSUBSWS $vA, $vB))>;		def : Pat<(v4i32 (ssubsat v4i32:$vA, v4i32:$vB)), (v4i32 (VSUBSWS $vA, $vB))>;
def : Pat<(v4i32 (usubsat v4i32:$vA, v4i32:$vB)), (v4i32 (VSUBUWS $vA, $vB))>;		def : Pat<(v4i32 (usubsat v4i32:$vA, v4i32:$vB)), (v4i32 (VSUBUWS $vA, $vB))>;

// Loads.		// Loads.
def : Pat<(v4i32 (load xoaddr:$src)), (LVX xoaddr:$src)>;		def : Pat<(v4i32 (load ForceXForm:$src)), (LVX ForceXForm:$src)>;

// Stores.		// Stores.
def : Pat<(store v4i32:$rS, xoaddr:$dst),		def : Pat<(store v4i32:$rS, ForceXForm:$dst),
(STVX $rS, xoaddr:$dst)>;		(STVX $rS, ForceXForm:$dst)>;

// Bit conversions.		// Bit conversions.
def : Pat<(v16i8 (bitconvert (v8i16 VRRC:$src))), (v16i8 VRRC:$src)>;		def : Pat<(v16i8 (bitconvert (v8i16 VRRC:$src))), (v16i8 VRRC:$src)>;
def : Pat<(v16i8 (bitconvert (v4i32 VRRC:$src))), (v16i8 VRRC:$src)>;		def : Pat<(v16i8 (bitconvert (v4i32 VRRC:$src))), (v16i8 VRRC:$src)>;
def : Pat<(v16i8 (bitconvert (v4f32 VRRC:$src))), (v16i8 VRRC:$src)>;		def : Pat<(v16i8 (bitconvert (v4f32 VRRC:$src))), (v16i8 VRRC:$src)>;
def : Pat<(v16i8 (bitconvert (v2i64 VRRC:$src))), (v16i8 VRRC:$src)>;		def : Pat<(v16i8 (bitconvert (v2i64 VRRC:$src))), (v16i8 VRRC:$src)>;
def : Pat<(v16i8 (bitconvert (v1i128 VRRC:$src))), (v16i8 VRRC:$src)>;		def : Pat<(v16i8 (bitconvert (v1i128 VRRC:$src))), (v16i8 VRRC:$src)>;

[... 719 lines not shown ...]

llvm/lib/Target/PowerPC/PPCInstrInfo.td

[... 1,137 lines not shown ...]
def addr : ComplexPattern<iPTR, 1, "SelectAddr",[], []>;		def addr : ComplexPattern<iPTR, 1, "SelectAddr",[], []>;

/// This is just the offset part of iaddr, used for preinc.		/// This is just the offset part of iaddr, used for preinc.
def iaddroff : ComplexPattern<iPTR, 1, "SelectAddrImmOffs", [], []>;		def iaddroff : ComplexPattern<iPTR, 1, "SelectAddrImmOffs", [], []>;

// PC Relative Address		// PC Relative Address
def pcreladdr : ComplexPattern<iPTR, 1, "SelectAddrPCRel", [], []>;		def pcreladdr : ComplexPattern<iPTR, 1, "SelectAddrPCRel", [], []>;

		// Load and Store Instruction Selection addressing modes.
		def DForm : ComplexPattern<iPTR, 2, "SelectDForm", [], [SDNPWantParent]>;
		def DSForm : ComplexPattern<iPTR, 2, "SelectDSForm", [], [SDNPWantParent]>;
		def DQForm : ComplexPattern<iPTR, 2, "SelectDQForm", [], [SDNPWantParent]>;
		def XForm : ComplexPattern<iPTR, 2, "SelectXForm", [], [SDNPWantParent]>;
		def ForceXForm : ComplexPattern<iPTR, 2, "SelectForceXForm", [], [SDNPWantParent]>;
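
The five ComplexPattern definitions above name the addressing modes that the new selection code chooses among. As a rough illustration of the idea described in the summary (compute a set of flags for a load/store, then look those flags up in a map to pick an addressing mode), here is a minimal standalone C++ sketch. The flag names, the table contents, the candidate-list parameter, and the fall-back to the indexed form are all hypothetical; this is not the computeMOFlags() or getAddrModeForFlags() code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical addressing modes, named after the ComplexPatterns above.
enum class AddrMode { DForm, DSForm, DQForm, XForm };

// Hypothetical flag bits describing the address of a memory operation.
constexpr uint32_t RegPlusImm16    = 1u << 0; // base register + signed 16-bit offset
constexpr uint32_t ImmMultipleOf4  = 1u << 1; // offset is a multiple of 4
constexpr uint32_t ImmMultipleOf16 = 1u << 2; // offset is a multiple of 16
constexpr uint32_t RegPlusReg      = 1u << 3; // base register + index register

// Derive the flag set for an address. An offset that does not fit in 16 bits
// is treated as if it had to be placed in an index register.
uint32_t computeFlags(bool offsetIsImm, int64_t imm) {
  if (!offsetIsImm || imm < -32768 || imm > 32767)
    return RegPlusReg;
  uint32_t flags = RegPlusImm16;
  if (imm % 4 == 0)
    flags |= ImmMultipleOf4;
  if (imm % 16 == 0)
    flags |= ImmMultipleOf16;
  return flags;
}

// Return the first candidate mode whose required bits are all present in
// `flags`. If no candidate matches, fall back to the indexed form.
AddrMode selectAddrMode(uint32_t flags,
                        const std::vector<AddrMode> &candidates) {
  static const std::map<AddrMode, uint32_t> required = {
      {AddrMode::DForm, RegPlusImm16},
      {AddrMode::DSForm, RegPlusImm16 | ImmMultipleOf4},
      {AddrMode::DQForm, RegPlusImm16 | ImmMultipleOf16},
      {AddrMode::XForm, RegPlusReg},
  };
  for (AddrMode mode : candidates) {
    auto it = required.find(mode);
    if (it != required.end() && (flags & it->second) == it->second)
      return mode;
  }
  return AddrMode::XForm;
}
```

In this sketch the choice depends only on the computed flags and on the order of the candidate list passed by the caller, not on the order in which patterns are declared, which is the kind of control the summary says the refactoring is after.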

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// PowerPC Instruction Predicate Definitions.		// PowerPC Instruction Predicate Definitions.
def In32BitMode : Predicate<"!Subtarget->isPPC64()">;		def In32BitMode : Predicate<"!Subtarget->isPPC64()">;
def In64BitMode : Predicate<"Subtarget->isPPC64()">;		def In64BitMode : Predicate<"Subtarget->isPPC64()">;
def IsBookE : Predicate<"Subtarget->isBookE()">;		def IsBookE : Predicate<"Subtarget->isBookE()">;
def IsNotBookE : Predicate<"!Subtarget->isBookE()">;		def IsNotBookE : Predicate<"!Subtarget->isBookE()">;
def HasOnlyMSYNC : Predicate<"Subtarget->hasOnlyMSYNC()">;		def HasOnlyMSYNC : Predicate<"Subtarget->hasOnlyMSYNC()">;
def HasSYNC : Predicate<"!Subtarget->hasOnlyMSYNC()">;		def HasSYNC : Predicate<"!Subtarget->hasOnlyMSYNC()">;
[... 1,062 lines not shown ...]
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// PPC32 Load Instructions.		// PPC32 Load Instructions.
//		//

// Unindexed (r+i) Loads.		// Unindexed (r+i) Loads.
let PPC970_Unit = 2 in {		let PPC970_Unit = 2 in {
def LBZ : DForm_1<34, (outs gprc:$rD), (ins memri:$src),		def LBZ : DForm_1<34, (outs gprc:$rD), (ins memri:$src),
"lbz $rD, $src", IIC_LdStLoad,		"lbz $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (zextloadi8 iaddr:$src))]>;		[(set i32:$rD, (zextloadi8 DForm:$src))]>;
def LHA : DForm_1<42, (outs gprc:$rD), (ins memri:$src),		def LHA : DForm_1<42, (outs gprc:$rD), (ins memri:$src),
"lha $rD, $src", IIC_LdStLHA,		"lha $rD, $src", IIC_LdStLHA,
[(set i32:$rD, (sextloadi16 iaddr:$src))]>,		[(set i32:$rD, (sextloadi16 DForm:$src))]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def LHZ : DForm_1<40, (outs gprc:$rD), (ins memri:$src),		def LHZ : DForm_1<40, (outs gprc:$rD), (ins memri:$src),
"lhz $rD, $src", IIC_LdStLoad,		"lhz $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (zextloadi16 iaddr:$src))]>;		[(set i32:$rD, (zextloadi16 DForm:$src))]>;
def LWZ : DForm_1<32, (outs gprc:$rD), (ins memri:$src),		def LWZ : DForm_1<32, (outs gprc:$rD), (ins memri:$src),
"lwz $rD, $src", IIC_LdStLoad,		"lwz $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (load iaddr:$src))]>;		[(set i32:$rD, (load DForm:$src))]>;

let Predicates = [HasFPU] in {		let Predicates = [HasFPU] in {
def LFS : DForm_1<48, (outs f4rc:$rD), (ins memri:$src),		def LFS : DForm_1<48, (outs f4rc:$rD), (ins memri:$src),
"lfs $rD, $src", IIC_LdStLFD,		"lfs $rD, $src", IIC_LdStLFD,
[(set f32:$rD, (load iaddr:$src))]>;		[(set f32:$rD, (load DForm:$src))]>;
def LFD : DForm_1<50, (outs f8rc:$rD), (ins memri:$src),		def LFD : DForm_1<50, (outs f8rc:$rD), (ins memri:$src),
"lfd $rD, $src", IIC_LdStLFD,		"lfd $rD, $src", IIC_LdStLFD,
[(set f64:$rD, (load iaddr:$src))]>;		[(set f64:$rD, (load DForm:$src))]>;
}		}


// Unindexed (r+i) Loads with Update (preinc).		// Unindexed (r+i) Loads with Update (preinc).
let mayLoad = 1, mayStore = 0, hasSideEffects = 0 in {		let mayLoad = 1, mayStore = 0, hasSideEffects = 0 in {
def LBZU : DForm_1<35, (outs gprc:$rD, ptr_rc_nor0:$ea_result), (ins memri:$addr),		def LBZU : DForm_1<35, (outs gprc:$rD, ptr_rc_nor0:$ea_result), (ins memri:$addr),
"lbzu $rD, $addr", IIC_LdStLoadUpd,		"lbzu $rD, $addr", IIC_LdStLoadUpd,
[]>, RegConstraint<"$addr.reg = $ea_result">,		[]>, RegConstraint<"$addr.reg = $ea_result">,
[... 68 lines not shown ...]
}		}
}		}

// Indexed (r+r) Loads.		// Indexed (r+r) Loads.
//		//
let PPC970_Unit = 2, mayLoad = 1, mayStore = 0 in {		let PPC970_Unit = 2, mayLoad = 1, mayStore = 0 in {
def LBZX : XForm_1_memOp<31, 87, (outs gprc:$rD), (ins memrr:$src),		def LBZX : XForm_1_memOp<31, 87, (outs gprc:$rD), (ins memrr:$src),
"lbzx $rD, $src", IIC_LdStLoad,		"lbzx $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (zextloadi8 xaddr:$src))]>;		[(set i32:$rD, (zextloadi8 XForm:$src))]>;
def LHAX : XForm_1_memOp<31, 343, (outs gprc:$rD), (ins memrr:$src),		def LHAX : XForm_1_memOp<31, 343, (outs gprc:$rD), (ins memrr:$src),
"lhax $rD, $src", IIC_LdStLHA,		"lhax $rD, $src", IIC_LdStLHA,
[(set i32:$rD, (sextloadi16 xaddr:$src))]>,		[(set i32:$rD, (sextloadi16 XForm:$src))]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def LHZX : XForm_1_memOp<31, 279, (outs gprc:$rD), (ins memrr:$src),		def LHZX : XForm_1_memOp<31, 279, (outs gprc:$rD), (ins memrr:$src),
"lhzx $rD, $src", IIC_LdStLoad,		"lhzx $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (zextloadi16 xaddr:$src))]>;		[(set i32:$rD, (zextloadi16 XForm:$src))]>;
def LWZX : XForm_1_memOp<31, 23, (outs gprc:$rD), (ins memrr:$src),		def LWZX : XForm_1_memOp<31, 23, (outs gprc:$rD), (ins memrr:$src),
"lwzx $rD, $src", IIC_LdStLoad,		"lwzx $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (load xaddr:$src))]>;		[(set i32:$rD, (load XForm:$src))]>;
def LHBRX : XForm_1_memOp<31, 790, (outs gprc:$rD), (ins memrr:$src),		def LHBRX : XForm_1_memOp<31, 790, (outs gprc:$rD), (ins memrr:$src),
"lhbrx $rD, $src", IIC_LdStLoad,		"lhbrx $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (PPClbrx xoaddr:$src, i16))]>;		[(set i32:$rD, (PPClbrx xoaddr:$src, i16))]>;
def LWBRX : XForm_1_memOp<31, 534, (outs gprc:$rD), (ins memrr:$src),		def LWBRX : XForm_1_memOp<31, 534, (outs gprc:$rD), (ins memrr:$src),
"lwbrx $rD, $src", IIC_LdStLoad,		"lwbrx $rD, $src", IIC_LdStLoad,
[(set i32:$rD, (PPClbrx xoaddr:$src, i32))]>;		[(set i32:$rD, (PPClbrx xoaddr:$src, i32))]>;

let Predicates = [HasFPU] in {		let Predicates = [HasFPU] in {
def LFSX : XForm_25_memOp<31, 535, (outs f4rc:$frD), (ins memrr:$src),		def LFSX : XForm_25_memOp<31, 535, (outs f4rc:$frD), (ins memrr:$src),
"lfsx $frD, $src", IIC_LdStLFD,		"lfsx $frD, $src", IIC_LdStLFD,
[(set f32:$frD, (load xaddr:$src))]>;		[(set f32:$frD, (load XForm:$src))]>;
def LFDX : XForm_25_memOp<31, 599, (outs f8rc:$frD), (ins memrr:$src),		def LFDX : XForm_25_memOp<31, 599, (outs f8rc:$frD), (ins memrr:$src),
"lfdx $frD, $src", IIC_LdStLFD,		"lfdx $frD, $src", IIC_LdStLFD,
[(set f64:$frD, (load xaddr:$src))]>;		[(set f64:$frD, (load XForm:$src))]>;

def LFIWAX : XForm_25_memOp<31, 855, (outs f8rc:$frD), (ins memrr:$src),		def LFIWAX : XForm_25_memOp<31, 855, (outs f8rc:$frD), (ins memrr:$src),
"lfiwax $frD, $src", IIC_LdStLFD,		"lfiwax $frD, $src", IIC_LdStLFD,
[(set f64:$frD, (PPClfiwax xoaddr:$src))]>;		[(set f64:$frD, (PPClfiwax xoaddr:$src))]>;
def LFIWZX : XForm_25_memOp<31, 887, (outs f8rc:$frD), (ins memrr:$src),		def LFIWZX : XForm_25_memOp<31, 887, (outs f8rc:$frD), (ins memrr:$src),
"lfiwzx $frD, $src", IIC_LdStLFD,		"lfiwzx $frD, $src", IIC_LdStLFD,
[(set f64:$frD, (PPClfiwzx xoaddr:$src))]>;		[(set f64:$frD, (PPClfiwzx xoaddr:$src))]>;
}		}
}		}

// Load Multiple		// Load Multiple
let mayLoad = 1, mayStore = 0, hasSideEffects = 0 in		let mayLoad = 1, mayStore = 0, hasSideEffects = 0 in
def LMW : DForm_1<46, (outs gprc:$rD), (ins memri:$src),		def LMW : DForm_1<46, (outs gprc:$rD), (ins memri:$src),
"lmw $rD, $src", IIC_LdStLMW, []>;		"lmw $rD, $src", IIC_LdStLMW, []>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// PPC32 Store Instructions.		// PPC32 Store Instructions.
//		//

// Unindexed (r+i) Stores.		// Unindexed (r+i) Stores.
let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {		let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {
def STB : DForm_1<38, (outs), (ins gprc:$rS, memri:$dst),		def STB : DForm_1<38, (outs), (ins gprc:$rS, memri:$dst),
"stb $rS, $dst", IIC_LdStStore,		"stb $rS, $dst", IIC_LdStStore,
[(truncstorei8 i32:$rS, iaddr:$dst)]>;		[(truncstorei8 i32:$rS, DForm:$dst)]>;
def STH : DForm_1<44, (outs), (ins gprc:$rS, memri:$dst),		def STH : DForm_1<44, (outs), (ins gprc:$rS, memri:$dst),
"sth $rS, $dst", IIC_LdStStore,		"sth $rS, $dst", IIC_LdStStore,
[(truncstorei16 i32:$rS, iaddr:$dst)]>;		[(truncstorei16 i32:$rS, DForm:$dst)]>;
def STW : DForm_1<36, (outs), (ins gprc:$rS, memri:$dst),		def STW : DForm_1<36, (outs), (ins gprc:$rS, memri:$dst),
"stw $rS, $dst", IIC_LdStStore,		"stw $rS, $dst", IIC_LdStStore,
[(store i32:$rS, iaddr:$dst)]>;		[(store i32:$rS, DForm:$dst)]>;
let Predicates = [HasFPU] in {		let Predicates = [HasFPU] in {
def STFS : DForm_1<52, (outs), (ins f4rc:$rS, memri:$dst),		def STFS : DForm_1<52, (outs), (ins f4rc:$rS, memri:$dst),
"stfs $rS, $dst", IIC_LdStSTFD,		"stfs $rS, $dst", IIC_LdStSTFD,
[(store f32:$rS, iaddr:$dst)]>;		[(store f32:$rS, DForm:$dst)]>;
def STFD : DForm_1<54, (outs), (ins f8rc:$rS, memri:$dst),		def STFD : DForm_1<54, (outs), (ins f8rc:$rS, memri:$dst),
"stfd $rS, $dst", IIC_LdStSTFD,		"stfd $rS, $dst", IIC_LdStSTFD,
[(store f64:$rS, iaddr:$dst)]>;		[(store f64:$rS, DForm:$dst)]>;
}		}
}		}

// Unindexed (r+i) Stores with Update (preinc).		// Unindexed (r+i) Stores with Update (preinc).
let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {		let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {
def STBU : DForm_1<39, (outs ptr_rc_nor0:$ea_res), (ins gprc:$rS, memri:$dst),		def STBU : DForm_1<39, (outs ptr_rc_nor0:$ea_res), (ins gprc:$rS, memri:$dst),
"stbu $rS, $dst", IIC_LdStSTU, []>,		"stbu $rS, $dst", IIC_LdStSTU, []>,
RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;		RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;
[... 26 lines not shown ...]
def : Pat<(pre_store f32:$rS, iPTR:$ptrreg, iaddroff:$ptroff),
(STFSU $rS, iaddroff:$ptroff, $ptrreg)>;		(STFSU $rS, iaddroff:$ptroff, $ptrreg)>;
def : Pat<(pre_store f64:$rS, iPTR:$ptrreg, iaddroff:$ptroff),		def : Pat<(pre_store f64:$rS, iPTR:$ptrreg, iaddroff:$ptroff),
(STFDU $rS, iaddroff:$ptroff, $ptrreg)>;		(STFDU $rS, iaddroff:$ptroff, $ptrreg)>;

// Indexed (r+r) Stores.		// Indexed (r+r) Stores.
let PPC970_Unit = 2 in {		let PPC970_Unit = 2 in {
def STBX : XForm_8_memOp<31, 215, (outs), (ins gprc:$rS, memrr:$dst),		def STBX : XForm_8_memOp<31, 215, (outs), (ins gprc:$rS, memrr:$dst),
"stbx $rS, $dst", IIC_LdStStore,		"stbx $rS, $dst", IIC_LdStStore,
[(truncstorei8 i32:$rS, xaddr:$dst)]>,		[(truncstorei8 i32:$rS, XForm:$dst)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def STHX : XForm_8_memOp<31, 407, (outs), (ins gprc:$rS, memrr:$dst),		def STHX : XForm_8_memOp<31, 407, (outs), (ins gprc:$rS, memrr:$dst),
"sthx $rS, $dst", IIC_LdStStore,		"sthx $rS, $dst", IIC_LdStStore,
[(truncstorei16 i32:$rS, xaddr:$dst)]>,		[(truncstorei16 i32:$rS, XForm:$dst)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def STWX : XForm_8_memOp<31, 151, (outs), (ins gprc:$rS, memrr:$dst),		def STWX : XForm_8_memOp<31, 151, (outs), (ins gprc:$rS, memrr:$dst),
"stwx $rS, $dst", IIC_LdStStore,		"stwx $rS, $dst", IIC_LdStStore,
[(store i32:$rS, xaddr:$dst)]>,		[(store i32:$rS, XForm:$dst)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;

def STHBRX: XForm_8_memOp<31, 918, (outs), (ins gprc:$rS, memrr:$dst),		def STHBRX: XForm_8_memOp<31, 918, (outs), (ins gprc:$rS, memrr:$dst),
"sthbrx $rS, $dst", IIC_LdStStore,		"sthbrx $rS, $dst", IIC_LdStStore,
[(PPCstbrx i32:$rS, xoaddr:$dst, i16)]>,		[(PPCstbrx i32:$rS, xoaddr:$dst, i16)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;
def STWBRX: XForm_8_memOp<31, 662, (outs), (ins gprc:$rS, memrr:$dst),		def STWBRX: XForm_8_memOp<31, 662, (outs), (ins gprc:$rS, memrr:$dst),
"stwbrx $rS, $dst", IIC_LdStStore,		"stwbrx $rS, $dst", IIC_LdStStore,
[(PPCstbrx i32:$rS, xoaddr:$dst, i32)]>,		[(PPCstbrx i32:$rS, xoaddr:$dst, i32)]>,
PPC970_DGroup_Cracked;		PPC970_DGroup_Cracked;

let Predicates = [HasFPU] in {		let Predicates = [HasFPU] in {
def STFIWX: XForm_28_memOp<31, 983, (outs), (ins f8rc:$frS, memrr:$dst),		def STFIWX: XForm_28_memOp<31, 983, (outs), (ins f8rc:$frS, memrr:$dst),
"stfiwx $frS, $dst", IIC_LdStSTFD,		"stfiwx $frS, $dst", IIC_LdStSTFD,
[(PPCstfiwx f64:$frS, xoaddr:$dst)]>;		[(PPCstfiwx f64:$frS, xoaddr:$dst)]>;

def STFSX : XForm_28_memOp<31, 663, (outs), (ins f4rc:$frS, memrr:$dst),		def STFSX : XForm_28_memOp<31, 663, (outs), (ins f4rc:$frS, memrr:$dst),
"stfsx $frS, $dst", IIC_LdStSTFD,		"stfsx $frS, $dst", IIC_LdStSTFD,
[(store f32:$frS, xaddr:$dst)]>;		[(store f32:$frS, XForm:$dst)]>;
def STFDX : XForm_28_memOp<31, 727, (outs), (ins f8rc:$frS, memrr:$dst),		def STFDX : XForm_28_memOp<31, 727, (outs), (ins f8rc:$frS, memrr:$dst),
"stfdx $frS, $dst", IIC_LdStSTFD,		"stfdx $frS, $dst", IIC_LdStSTFD,
[(store f64:$frS, xaddr:$dst)]>;		[(store f64:$frS, XForm:$dst)]>;
}		}
}		}

// Indexed (r+r) Stores with Update (preinc).		// Indexed (r+r) Stores with Update (preinc).
let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {		let PPC970_Unit = 2, mayStore = 1, mayLoad = 0 in {
def STBUX : XForm_8_memOp<31, 247, (outs ptr_rc_nor0:$ea_res),		def STBUX : XForm_8_memOp<31, 247, (outs ptr_rc_nor0:$ea_res),
(ins gprc:$rS, memrr:$dst),		(ins gprc:$rS, memrr:$dst),
"stbux $rS, $dst", IIC_LdStSTUX, []>,		"stbux $rS, $dst", IIC_LdStSTUX, []>,
[... 1,083 lines not shown ...]
// amounts.		// amounts.
def : Pat<(sra i32:$rS, i32:$rB),		def : Pat<(sra i32:$rS, i32:$rB),
(SRAW $rS, $rB)>;		(SRAW $rS, $rB)>;
def : Pat<(srl i32:$rS, i32:$rB),		def : Pat<(srl i32:$rS, i32:$rB),
(SRW $rS, $rB)>;		(SRW $rS, $rB)>;
def : Pat<(shl i32:$rS, i32:$rB),		def : Pat<(shl i32:$rS, i32:$rB),
(SLW $rS, $rB)>;		(SLW $rS, $rB)>;

def : Pat<(i32 (zextloadi1 iaddr:$src)),		def : Pat<(i32 (zextloadi1 DForm:$src)),
(LBZ iaddr:$src)>;		(LBZ DForm:$src)>;
def : Pat<(i32 (zextloadi1 xaddr:$src)),		def : Pat<(i32 (zextloadi1 XForm:$src)),
(LBZX xaddr:$src)>;		(LBZX XForm:$src)>;
def : Pat<(i32 (extloadi1 iaddr:$src)),		def : Pat<(i32 (extloadi1 DForm:$src)),
(LBZ iaddr:$src)>;		(LBZ DForm:$src)>;
def : Pat<(i32 (extloadi1 xaddr:$src)),		def : Pat<(i32 (extloadi1 XForm:$src)),
(LBZX xaddr:$src)>;		(LBZX XForm:$src)>;
def : Pat<(i32 (extloadi8 iaddr:$src)),		def : Pat<(i32 (extloadi8 DForm:$src)),
(LBZ iaddr:$src)>;		(LBZ DForm:$src)>;
def : Pat<(i32 (extloadi8 xaddr:$src)),		def : Pat<(i32 (extloadi8 XForm:$src)),
(LBZX xaddr:$src)>;		(LBZX XForm:$src)>;
def : Pat<(i32 (extloadi16 iaddr:$src)),		def : Pat<(i32 (extloadi16 DForm:$src)),
(LHZ iaddr:$src)>;		(LHZ DForm:$src)>;
def : Pat<(i32 (extloadi16 xaddr:$src)),		def : Pat<(i32 (extloadi16 XForm:$src)),
(LHZX xaddr:$src)>;		(LHZX XForm:$src)>;
let Predicates = [HasFPU] in {		let Predicates = [HasFPU] in {
def : Pat<(f64 (extloadf32 iaddr:$src)),		def : Pat<(f64 (extloadf32 DForm:$src)),
(COPY_TO_REGCLASS (LFS iaddr:$src), F8RC)>;		(COPY_TO_REGCLASS (LFS DForm:$src), F8RC)>;
def : Pat<(f64 (extloadf32 xaddr:$src)),		def : Pat<(f64 (extloadf32 XForm:$src)),
(COPY_TO_REGCLASS (LFSX xaddr:$src), F8RC)>;		(COPY_TO_REGCLASS (LFSX XForm:$src), F8RC)>;

def : Pat<(f64 (any_fpextend f32:$src)),		def : Pat<(f64 (any_fpextend f32:$src)),
(COPY_TO_REGCLASS $src, F8RC)>;		(COPY_TO_REGCLASS $src, F8RC)>;
}		}

// Only seq_cst fences require the heavyweight sync (SYNC 0).		// Only seq_cst fences require the heavyweight sync (SYNC 0).
// All others can use the lightweight sync (SYNC 1).		// All others can use the lightweight sync (SYNC 1).
// source: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html		// source: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
[... 1,797 lines not shown ...]

llvm/lib/Target/PowerPC/PPCInstrPrefix.td

[... 2,533 lines not shown ...]
def : Pat<(v4i32 (int_ppc_vsx_xxgenpcvwm v4i32:$VRB, imm:$IMM)),
(v4i32 (COPY_TO_REGCLASS (XXGENPCVWM $VRB, imm:$IMM), VRRC))>;		(v4i32 (COPY_TO_REGCLASS (XXGENPCVWM $VRB, imm:$IMM), VRRC))>;
def : Pat<(v2i64 (int_ppc_vsx_xxgenpcvdm v2i64:$VRB, imm:$IMM)),		def : Pat<(v2i64 (int_ppc_vsx_xxgenpcvdm v2i64:$VRB, imm:$IMM)),
(v2i64 (COPY_TO_REGCLASS (XXGENPCVDM $VRB, imm:$IMM), VRRC))>;		(v2i64 (COPY_TO_REGCLASS (XXGENPCVDM $VRB, imm:$IMM), VRRC))>;
def : Pat<(i32 (int_ppc_vsx_xvtlsbb v16i8:$XB, 1)),		def : Pat<(i32 (int_ppc_vsx_xvtlsbb v16i8:$XB, 1)),
(EXTRACT_SUBREG (XVTLSBB (COPY_TO_REGCLASS $XB, VSRC)), sub_lt)>;		(EXTRACT_SUBREG (XVTLSBB (COPY_TO_REGCLASS $XB, VSRC)), sub_lt)>;
def : Pat<(i32 (int_ppc_vsx_xvtlsbb v16i8:$XB, 0)),		def : Pat<(i32 (int_ppc_vsx_xvtlsbb v16i8:$XB, 0)),
(EXTRACT_SUBREG (XVTLSBB (COPY_TO_REGCLASS $XB, VSRC)), sub_eq)>;		(EXTRACT_SUBREG (XVTLSBB (COPY_TO_REGCLASS $XB, VSRC)), sub_eq)>;

def : Pat <(v1i128 (PPClxvrzx xoaddr:$src, 8)),		def : Pat <(v1i128 (PPClxvrzx ForceXForm:$src, 8)),
(v1i128 (COPY_TO_REGCLASS (LXVRBX xoaddr:$src), VRRC))>;		(v1i128 (COPY_TO_REGCLASS (LXVRBX ForceXForm:$src), VRRC))>;
def : Pat <(v1i128 (PPClxvrzx xoaddr:$src, 16)),		def : Pat <(v1i128 (PPClxvrzx ForceXForm:$src, 16)),
(v1i128 (COPY_TO_REGCLASS (LXVRHX xoaddr:$src), VRRC))>;		(v1i128 (COPY_TO_REGCLASS (LXVRHX ForceXForm:$src), VRRC))>;
def : Pat <(v1i128 (PPClxvrzx xoaddr:$src, 32)),		def : Pat <(v1i128 (PPClxvrzx ForceXForm:$src, 32)),
(v1i128 (COPY_TO_REGCLASS (LXVRWX xoaddr:$src), VRRC))>;		(v1i128 (COPY_TO_REGCLASS (LXVRWX ForceXForm:$src), VRRC))>;
def : Pat <(v1i128 (PPClxvrzx xoaddr:$src, 64)),		def : Pat <(v1i128 (PPClxvrzx ForceXForm:$src, 64)),
(v1i128 (COPY_TO_REGCLASS (LXVRDX xoaddr:$src), VRRC))>;		(v1i128 (COPY_TO_REGCLASS (LXVRDX ForceXForm:$src), VRRC))>;

def : Pat<(v1i128 (rotl v1i128:$vA, v1i128:$vB)),		def : Pat<(v1i128 (rotl v1i128:$vA, v1i128:$vB)),
(v1i128 (VRLQ v1i128:$vA, v1i128:$vB))>;		(v1i128 (VRLQ v1i128:$vA, v1i128:$vB))>;

def : Pat <(v2i64 (PPCxxsplti32dx v2i64:$XT, i32:$XI, i32:$IMM32)),		def : Pat <(v2i64 (PPCxxsplti32dx v2i64:$XT, i32:$XI, i32:$IMM32)),
(v2i64 (XXSPLTI32DX v2i64:$XT, i32:$XI, i32:$IMM32))>;		(v2i64 (XXSPLTI32DX v2i64:$XT, i32:$XI, i32:$IMM32))>;
}		}

let Predicates = [IsISA3_1, HasVSX] in {		let Predicates = [IsISA3_1, HasVSX] in {
def : Pat<(v16i8 (int_ppc_vsx_xvcvspbf16 v16i8:$XA)),		def : Pat<(v16i8 (int_ppc_vsx_xvcvspbf16 v16i8:$XA)),
(COPY_TO_REGCLASS (XVCVSPBF16 RCCp.AToVSRC), VRRC)>;		(COPY_TO_REGCLASS (XVCVSPBF16 RCCp.AToVSRC), VRRC)>;
def : Pat<(v16i8 (int_ppc_vsx_xvcvbf16spn v16i8:$XA)),		def : Pat<(v16i8 (int_ppc_vsx_xvcvbf16spn v16i8:$XA)),
(COPY_TO_REGCLASS (XVCVBF16SPN RCCp.AToVSRC), VRRC)>;		(COPY_TO_REGCLASS (XVCVBF16SPN RCCp.AToVSRC), VRRC)>;
}		}

let AddedComplexity = 400, Predicates = [IsISA3_1, IsLittleEndian] in {		let AddedComplexity = 400, Predicates = [IsISA3_1, IsLittleEndian] in {
// Store element 0 of a VSX register to memory		// Store element 0 of a VSX register to memory
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$src, 0)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$src, 0)), ForceXForm:$dst),
(STXVRBX (COPY_TO_REGCLASS v16i8:$src, VSRC), xoaddr:$dst)>;		(STXVRBX (COPY_TO_REGCLASS v16i8:$src, VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$src, 0)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$src, 0)), ForceXForm:$dst),
(STXVRHX (COPY_TO_REGCLASS v8i16:$src, VSRC), xoaddr:$dst)>;		(STXVRHX (COPY_TO_REGCLASS v8i16:$src, VSRC), ForceXForm:$dst)>;
def : Pat<(store (i32 (extractelt v4i32:$src, 0)), xoaddr:$dst),		def : Pat<(store (i32 (extractelt v4i32:$src, 0)), ForceXForm:$dst),
(STXVRWX $src, xoaddr:$dst)>;		(STXVRWX $src, ForceXForm:$dst)>;
def : Pat<(store (f32 (extractelt v4f32:$src, 0)), xoaddr:$dst),		def : Pat<(store (f32 (extractelt v4f32:$src, 0)), ForceXForm:$dst),
(STXVRWX $src, xoaddr:$dst)>;		(STXVRWX $src, ForceXForm:$dst)>;
def : Pat<(store (i64 (extractelt v2i64:$src, 0)), xoaddr:$dst),		def : Pat<(store (i64 (extractelt v2i64:$src, 0)), ForceXForm:$dst),
(STXVRDX $src, xoaddr:$dst)>;		(STXVRDX $src, ForceXForm:$dst)>;
def : Pat<(store (f64 (extractelt v2f64:$src, 0)), xoaddr:$dst),		def : Pat<(store (f64 (extractelt v2f64:$src, 0)), ForceXForm:$dst),
(STXVRDX $src, xoaddr:$dst)>;		(STXVRDX $src, ForceXForm:$dst)>;
// Load element 0 of a VSX register to memory		// Load element 0 of a VSX register to memory
def : Pat<(v8i16 (scalar_to_vector (i32 (extloadi16 xoaddr:$src)))),		def : Pat<(v8i16 (scalar_to_vector (i32 (extloadi16 ForceXForm:$src)))),
(v8i16 (COPY_TO_REGCLASS (LXVRHX xoaddr:$src), VSRC))>;		(v8i16 (COPY_TO_REGCLASS (LXVRHX ForceXForm:$src), VSRC))>;
def : Pat<(v16i8 (scalar_to_vector (i32 (extloadi8 xoaddr:$src)))),		def : Pat<(v16i8 (scalar_to_vector (i32 (extloadi8 ForceXForm:$src)))),
(v16i8 (COPY_TO_REGCLASS (LXVRBX xoaddr:$src), VSRC))>;		(v16i8 (COPY_TO_REGCLASS (LXVRBX ForceXForm:$src), VSRC))>;
}		}

// FIXME: The swap is overkill when the shift amount is a constant.		// FIXME: The swap is overkill when the shift amount is a constant.
// We should just fix the constant in the DAG.		// We should just fix the constant in the DAG.
let AddedComplexity = 400, Predicates = [IsISA3_1, HasVSX] in {		let AddedComplexity = 400, Predicates = [IsISA3_1, HasVSX] in {
def : Pat<(v1i128 (shl v1i128:$VRA, v1i128:$VRB)),		def : Pat<(v1i128 (shl v1i128:$VRA, v1i128:$VRB)),
(v1i128 (VSLQ v1i128:$VRA,		(v1i128 (VSLQ v1i128:$VRA,
(XXPERMDI (COPY_TO_REGCLASS $VRB, VSRC),		(XXPERMDI (COPY_TO_REGCLASS $VRB, VSRC),
[... 207 lines not shown ...]

llvm/lib/Target/PowerPC/PPCInstrVSX.td

[... 310 lines not shown ...]
def LXSDX : XX1Form_memOp<31, 588,
(outs vsfrc:$XT), (ins memrr:$src),		(outs vsfrc:$XT), (ins memrr:$src),
"lxsdx $XT, $src", IIC_LdStLFD,		"lxsdx $XT, $src", IIC_LdStLFD,
[]>;		[]>;

// Pseudo instruction XFLOADf64 will be expanded to LXSDX or LFDX later		// Pseudo instruction XFLOADf64 will be expanded to LXSDX or LFDX later
let CodeSize = 3 in		let CodeSize = 3 in
def XFLOADf64 : PseudoXFormMemOp<(outs vsfrc:$XT), (ins memrr:$src),		def XFLOADf64 : PseudoXFormMemOp<(outs vsfrc:$XT), (ins memrr:$src),
"#XFLOADf64",		"#XFLOADf64",
[(set f64:$XT, (load xoaddr:$src))]>;		[(set f64:$XT, (load ForceXForm:$src))]>;

let Predicates = [HasVSX, HasOnlySwappingMemOps] in		let Predicates = [HasVSX, HasOnlySwappingMemOps] in
def LXVD2X : XX1Form_memOp<31, 844,		def LXVD2X : XX1Form_memOp<31, 844,
(outs vsrc:$XT), (ins memrr:$src),		(outs vsrc:$XT), (ins memrr:$src),
"lxvd2x $XT, $src", IIC_LdStLFD,		"lxvd2x $XT, $src", IIC_LdStLFD,
[(set v2f64:$XT, (int_ppc_vsx_lxvd2x xoaddr:$src))]>;		[(set v2f64:$XT, (int_ppc_vsx_lxvd2x ForceXForm:$src))]>;

def LXVDSX : XX1Form_memOp<31, 332,		def LXVDSX : XX1Form_memOp<31, 332,
(outs vsrc:$XT), (ins memrr:$src),		(outs vsrc:$XT), (ins memrr:$src),
"lxvdsx $XT, $src", IIC_LdStLFD, []>;		"lxvdsx $XT, $src", IIC_LdStLFD, []>;

let Predicates = [HasVSX, HasOnlySwappingMemOps] in		let Predicates = [HasVSX, HasOnlySwappingMemOps] in
def LXVW4X : XX1Form_memOp<31, 780,		def LXVW4X : XX1Form_memOp<31, 780,
(outs vsrc:$XT), (ins memrr:$src),		(outs vsrc:$XT), (ins memrr:$src),
"lxvw4x $XT, $src", IIC_LdStLFD,		"lxvw4x $XT, $src", IIC_LdStLFD,
[]>;		[]>;
} // mayLoad		} // mayLoad

// Store indexed instructions		// Store indexed instructions
let mayStore = 1, mayLoad = 0 in {		let mayStore = 1, mayLoad = 0 in {
let CodeSize = 3 in		let CodeSize = 3 in
def STXSDX : XX1Form_memOp<31, 716,		def STXSDX : XX1Form_memOp<31, 716,
(outs), (ins vsfrc:$XT, memrr:$dst),		(outs), (ins vsfrc:$XT, memrr:$dst),
"stxsdx $XT, $dst", IIC_LdStSTFD,		"stxsdx $XT, $dst", IIC_LdStSTFD,
[]>;		[]>;

// Pseudo instruction XFSTOREf64 will be expanded to STXSDX or STFDX later		// Pseudo instruction XFSTOREf64 will be expanded to STXSDX or STFDX later
let CodeSize = 3 in		let CodeSize = 3 in
def XFSTOREf64 : PseudoXFormMemOp<(outs), (ins vsfrc:$XT, memrr:$dst),		def XFSTOREf64 : PseudoXFormMemOp<(outs), (ins vsfrc:$XT, memrr:$dst),
"#XFSTOREf64",		"#XFSTOREf64",
[(store f64:$XT, xoaddr:$dst)]>;		[(store f64:$XT, ForceXForm:$dst)]>;

let Predicates = [HasVSX, HasOnlySwappingMemOps] in {		let Predicates = [HasVSX, HasOnlySwappingMemOps] in {
// The behaviour of this instruction is endianness-specific so we provide no		// The behaviour of this instruction is endianness-specific so we provide no
// pattern to match it without considering endianness.		// pattern to match it without considering endianness.
def STXVD2X : XX1Form_memOp<31, 972,		def STXVD2X : XX1Form_memOp<31, 972,
(outs), (ins vsrc:$XT, memrr:$dst),		(outs), (ins vsrc:$XT, memrr:$dst),
"stxvd2x $XT, $dst", IIC_LdStSTFD,		"stxvd2x $XT, $dst", IIC_LdStSTFD,
[]>;		[]>;
[... 764 lines not shown ...]
def LXSIWAX : XX1Form_memOp<31, 76, (outs vsfrc:$XT), (ins memrr:$src),
"lxsiwax $XT, $src", IIC_LdStLFD, []>;		"lxsiwax $XT, $src", IIC_LdStLFD, []>;
def LXSIWZX : XX1Form_memOp<31, 12, (outs vsfrc:$XT), (ins memrr:$src),		def LXSIWZX : XX1Form_memOp<31, 12, (outs vsfrc:$XT), (ins memrr:$src),
"lxsiwzx $XT, $src", IIC_LdStLFD, []>;		"lxsiwzx $XT, $src", IIC_LdStLFD, []>;

// Pseudo instruction XFLOADf32 will be expanded to LXSSPX or LFSX later		// Pseudo instruction XFLOADf32 will be expanded to LXSSPX or LFSX later
let CodeSize = 3 in		let CodeSize = 3 in
def XFLOADf32 : PseudoXFormMemOp<(outs vssrc:$XT), (ins memrr:$src),		def XFLOADf32 : PseudoXFormMemOp<(outs vssrc:$XT), (ins memrr:$src),
"#XFLOADf32",		"#XFLOADf32",
[(set f32:$XT, (load xoaddr:$src))]>;		[(set f32:$XT, (load ForceXForm:$src))]>;
// Pseudo instruction LIWAX will be expanded to LXSIWAX or LFIWAX later		// Pseudo instruction LIWAX will be expanded to LXSIWAX or LFIWAX later
def LIWAX : PseudoXFormMemOp<(outs vsfrc:$XT), (ins memrr:$src),		def LIWAX : PseudoXFormMemOp<(outs vsfrc:$XT), (ins memrr:$src),
"#LIWAX",		"#LIWAX",
[(set f64:$XT, (PPClfiwax xoaddr:$src))]>;		[(set f64:$XT, (PPClfiwax ForceXForm:$src))]>;
// Pseudo instruction LIWZX will be expanded to LXSIWZX or LFIWZX later		// Pseudo instruction LIWZX will be expanded to LXSIWZX or LFIWZX later
def LIWZX : PseudoXFormMemOp<(outs vsfrc:$XT), (ins memrr:$src),		def LIWZX : PseudoXFormMemOp<(outs vsfrc:$XT), (ins memrr:$src),
"#LIWZX",		"#LIWZX",
[(set f64:$XT, (PPClfiwzx xoaddr:$src))]>;		[(set f64:$XT, (PPClfiwzx ForceXForm:$src))]>;
} // mayLoad		} // mayLoad

// VSX scalar stores introduced in ISA 2.07		// VSX scalar stores introduced in ISA 2.07
let mayStore = 1, mayLoad = 0 in {		let mayStore = 1, mayLoad = 0 in {
let CodeSize = 3 in		let CodeSize = 3 in
def STXSSPX : XX1Form_memOp<31, 652, (outs), (ins vssrc:$XT, memrr:$dst),		def STXSSPX : XX1Form_memOp<31, 652, (outs), (ins vssrc:$XT, memrr:$dst),
"stxsspx $XT, $dst", IIC_LdStSTFD, []>;		"stxsspx $XT, $dst", IIC_LdStSTFD, []>;
def STXSIWX : XX1Form_memOp<31, 140, (outs), (ins vsfrc:$XT, memrr:$dst),		def STXSIWX : XX1Form_memOp<31, 140, (outs), (ins vsfrc:$XT, memrr:$dst),
"stxsiwx $XT, $dst", IIC_LdStSTFD, []>;		"stxsiwx $XT, $dst", IIC_LdStSTFD, []>;

// Pseudo instruction XFSTOREf32 will be expanded to STXSSPX or STFSX later		// Pseudo instruction XFSTOREf32 will be expanded to STXSSPX or STFSX later
let CodeSize = 3 in		let CodeSize = 3 in
def XFSTOREf32 : PseudoXFormMemOp<(outs), (ins vssrc:$XT, memrr:$dst),		def XFSTOREf32 : PseudoXFormMemOp<(outs), (ins vssrc:$XT, memrr:$dst),
"#XFSTOREf32",		"#XFSTOREf32",
[(store f32:$XT, xoaddr:$dst)]>;		[(store f32:$XT, ForceXForm:$dst)]>;
// Pseudo instruction STIWX will be expanded to STXSIWX or STFIWX later		// Pseudo instruction STIWX will be expanded to STXSIWX or STFIWX later
def STIWX : PseudoXFormMemOp<(outs), (ins vsfrc:$XT, memrr:$dst),		def STIWX : PseudoXFormMemOp<(outs), (ins vsfrc:$XT, memrr:$dst),
"#STIWX",		"#STIWX",
[(PPCstfiwx f64:$XT, xoaddr:$dst)]>;		[(PPCstfiwx f64:$XT, ForceXForm:$dst)]>;
} // mayStore		} // mayStore

// VSX Elementary Scalar FP arithmetic (SP)		// VSX Elementary Scalar FP arithmetic (SP)
let mayRaiseFPException = 1 in {		let mayRaiseFPException = 1 in {
let isCommutable = 1 in {		let isCommutable = 1 in {
def XSADDSP : XX3Form<60, 0,		def XSADDSP : XX3Form<60, 0,
(outs vssrc:$XT), (ins vssrc:$XA, vssrc:$XB),		(outs vssrc:$XT), (ins vssrc:$XA, vssrc:$XB),
"xsaddsp $XT, $XA, $XB", IIC_VecFP,		"xsaddsp $XT, $XA, $XB", IIC_VecFP,
[510 lines not shown]	let Predicates = [HasVSX, HasP9Vector] in {
def LXSD : DSForm_1<57, 2, (outs vfrc:$vD), (ins memrix:$src),		def LXSD : DSForm_1<57, 2, (outs vfrc:$vD), (ins memrix:$src),
"lxsd $vD, $src", IIC_LdStLFD, []>;		"lxsd $vD, $src", IIC_LdStLFD, []>;
// Load SP from src, convert it to DP, and place in dword[0]		// Load SP from src, convert it to DP, and place in dword[0]
def LXSSP : DSForm_1<57, 3, (outs vfrc:$vD), (ins memrix:$src),		def LXSSP : DSForm_1<57, 3, (outs vfrc:$vD), (ins memrix:$src),
"lxssp $vD, $src", IIC_LdStLFD, []>;		"lxssp $vD, $src", IIC_LdStLFD, []>;

// Load as Integer Byte/Halfword & Zero Indexed		// Load as Integer Byte/Halfword & Zero Indexed
def LXSIBZX : X_XT6_RA5_RB5<31, 781, "lxsibzx", vsfrc,		def LXSIBZX : X_XT6_RA5_RB5<31, 781, "lxsibzx", vsfrc,
[(set f64:$XT, (PPClxsizx xoaddr:$src, 1))]>;		[(set f64:$XT, (PPClxsizx ForceXForm:$src, 1))]>;
def LXSIHZX : X_XT6_RA5_RB5<31, 813, "lxsihzx", vsfrc,		def LXSIHZX : X_XT6_RA5_RB5<31, 813, "lxsihzx", vsfrc,
[(set f64:$XT, (PPClxsizx xoaddr:$src, 2))]>;		[(set f64:$XT, (PPClxsizx ForceXForm:$src, 2))]>;

// Load Vector Halfword8/Byte16 Indexed		// Load Vector Halfword8/Byte16 Indexed
def LXVH8X : X_XT6_RA5_RB5<31, 812, "lxvh8x" , vsrc, []>;		def LXVH8X : X_XT6_RA5_RB5<31, 812, "lxvh8x" , vsrc, []>;
def LXVB16X : X_XT6_RA5_RB5<31, 876, "lxvb16x", vsrc, []>;		def LXVB16X : X_XT6_RA5_RB5<31, 876, "lxvb16x", vsrc, []>;

// Load Vector Indexed		// Load Vector Indexed
def LXVX : X_XT6_RA5_RB5<31, 268, "lxvx" , vsrc,		def LXVX : X_XT6_RA5_RB5<31, 268, "lxvx" , vsrc,
[(set v2f64:$XT, (load xaddrX16:$src))]>;		[(set v2f64:$XT, (load XForm:$src))]>;
// Load Vector (Left-justified) with Length		// Load Vector (Left-justified) with Length
def LXVL : XX1Form_memOp<31, 269, (outs vsrc:$XT), (ins memr:$src, g8rc:$rB),		def LXVL : XX1Form_memOp<31, 269, (outs vsrc:$XT), (ins memr:$src, g8rc:$rB),
"lxvl $XT, $src, $rB", IIC_LdStLoad,		"lxvl $XT, $src, $rB", IIC_LdStLoad,
[(set v4i32:$XT, (int_ppc_vsx_lxvl addr:$src, i64:$rB))]>;		[(set v4i32:$XT, (int_ppc_vsx_lxvl addr:$src, i64:$rB))]>;
def LXVLL : XX1Form_memOp<31,301, (outs vsrc:$XT), (ins memr:$src, g8rc:$rB),		def LXVLL : XX1Form_memOp<31,301, (outs vsrc:$XT), (ins memr:$src, g8rc:$rB),
"lxvll $XT, $src, $rB", IIC_LdStLoad,		"lxvll $XT, $src, $rB", IIC_LdStLoad,
[(set v4i32:$XT, (int_ppc_vsx_lxvll addr:$src, i64:$rB))]>;		[(set v4i32:$XT, (int_ppc_vsx_lxvll addr:$src, i64:$rB))]>;

[11 lines not shown]	let Predicates = [HasVSX, HasP9Vector] in {
def STXSD : DSForm_1<61, 2, (outs), (ins vfrc:$vS, memrix:$dst),		def STXSD : DSForm_1<61, 2, (outs), (ins vfrc:$vS, memrix:$dst),
"stxsd $vS, $dst", IIC_LdStSTFD, []>;		"stxsd $vS, $dst", IIC_LdStSTFD, []>;
// Convert DP of dword[0] to SP, and Store to dst		// Convert DP of dword[0] to SP, and Store to dst
def STXSSP : DSForm_1<61, 3, (outs), (ins vfrc:$vS, memrix:$dst),		def STXSSP : DSForm_1<61, 3, (outs), (ins vfrc:$vS, memrix:$dst),
"stxssp $vS, $dst", IIC_LdStSTFD, []>;		"stxssp $vS, $dst", IIC_LdStSTFD, []>;

// Store as Integer Byte/Halfword Indexed		// Store as Integer Byte/Halfword Indexed
def STXSIBX : X_XS6_RA5_RB5<31, 909, "stxsibx" , vsfrc,		def STXSIBX : X_XS6_RA5_RB5<31, 909, "stxsibx" , vsfrc,
[(PPCstxsix f64:$XT, xoaddr:$dst, 1)]>;		[(PPCstxsix f64:$XT, ForceXForm:$dst, 1)]>;
def STXSIHX : X_XS6_RA5_RB5<31, 941, "stxsihx" , vsfrc,		def STXSIHX : X_XS6_RA5_RB5<31, 941, "stxsihx" , vsfrc,
[(PPCstxsix f64:$XT, xoaddr:$dst, 2)]>;		[(PPCstxsix f64:$XT, ForceXForm:$dst, 2)]>;
let isCodeGenOnly = 1 in {		let isCodeGenOnly = 1 in {
def STXSIBXv : X_XS6_RA5_RB5<31, 909, "stxsibx" , vsrc, []>;		def STXSIBXv : X_XS6_RA5_RB5<31, 909, "stxsibx" , vsrc, []>;
def STXSIHXv : X_XS6_RA5_RB5<31, 941, "stxsihx" , vsrc, []>;		def STXSIHXv : X_XS6_RA5_RB5<31, 941, "stxsihx" , vsrc, []>;
}		}

// Store Vector Halfword8/Byte16 Indexed		// Store Vector Halfword8/Byte16 Indexed
def STXVH8X : X_XS6_RA5_RB5<31, 940, "stxvh8x" , vsrc, []>;		def STXVH8X : X_XS6_RA5_RB5<31, 940, "stxvh8x" , vsrc, []>;
def STXVB16X : X_XS6_RA5_RB5<31, 1004, "stxvb16x", vsrc, []>;		def STXVB16X : X_XS6_RA5_RB5<31, 1004, "stxvb16x", vsrc, []>;

// Store Vector Indexed		// Store Vector Indexed
def STXVX : X_XS6_RA5_RB5<31, 396, "stxvx" , vsrc,		def STXVX : X_XS6_RA5_RB5<31, 396, "stxvx" , vsrc,
[(store v2f64:$XT, xaddrX16:$dst)]>;		[(store v2f64:$XT, XForm:$dst)]>;

// Store Vector (Left-justified) with Length		// Store Vector (Left-justified) with Length
def STXVL : XX1Form_memOp<31, 397, (outs),		def STXVL : XX1Form_memOp<31, 397, (outs),
(ins vsrc:$XT, memr:$dst, g8rc:$rB),		(ins vsrc:$XT, memr:$dst, g8rc:$rB),
"stxvl $XT, $dst, $rB", IIC_LdStLoad,		"stxvl $XT, $dst, $rB", IIC_LdStLoad,
[(int_ppc_vsx_stxvl v4i32:$XT, addr:$dst,		[(int_ppc_vsx_stxvl v4i32:$XT, addr:$dst,
i64:$rB)]>;		i64:$rB)]>;
def STXVLL : XX1Form_memOp<31, 429, (outs),		def STXVLL : XX1Form_memOp<31, 429, (outs),
(ins vsrc:$XT, memr:$dst, g8rc:$rB),		(ins vsrc:$XT, memr:$dst, g8rc:$rB),
"stxvll $XT, $dst, $rB", IIC_LdStLoad,		"stxvll $XT, $dst, $rB", IIC_LdStLoad,
[(int_ppc_vsx_stxvll v4i32:$XT, addr:$dst,		[(int_ppc_vsx_stxvll v4i32:$XT, addr:$dst,
i64:$rB)]>;		i64:$rB)]>;
} // mayStore		} // mayStore

def DFLOADf32 : PPCPostRAExpPseudo<(outs vssrc:$XT), (ins memrix:$src),		def DFLOADf32 : PPCPostRAExpPseudo<(outs vssrc:$XT), (ins memrix:$src),
"#DFLOADf32",		"#DFLOADf32",
[(set f32:$XT, (load iaddrX4:$src))]>;		[(set f32:$XT, (load DSForm:$src))]>;
def DFLOADf64 : PPCPostRAExpPseudo<(outs vsfrc:$XT), (ins memrix:$src),		def DFLOADf64 : PPCPostRAExpPseudo<(outs vsfrc:$XT), (ins memrix:$src),
"#DFLOADf64",		"#DFLOADf64",
[(set f64:$XT, (load iaddrX4:$src))]>;		[(set f64:$XT, (load DSForm:$src))]>;
def DFSTOREf32 : PPCPostRAExpPseudo<(outs), (ins vssrc:$XT, memrix:$dst),		def DFSTOREf32 : PPCPostRAExpPseudo<(outs), (ins vssrc:$XT, memrix:$dst),
"#DFSTOREf32",		"#DFSTOREf32",
[(store f32:$XT, iaddrX4:$dst)]>;		[(store f32:$XT, DSForm:$dst)]>;
def DFSTOREf64 : PPCPostRAExpPseudo<(outs), (ins vsfrc:$XT, memrix:$dst),		def DFSTOREf64 : PPCPostRAExpPseudo<(outs), (ins vsfrc:$XT, memrix:$dst),
"#DFSTOREf64",		"#DFSTOREf64",
[(store f64:$XT, iaddrX4:$dst)]>;		[(store f64:$XT, DSForm:$dst)]>;

let mayStore = 1 in {		let mayStore = 1 in {
def SPILLTOVSR_STX : PseudoXFormMemOp<(outs),		def SPILLTOVSR_STX : PseudoXFormMemOp<(outs),
(ins spilltovsrrc:$XT, memrr:$dst),		(ins spilltovsrrc:$XT, memrr:$dst),
"#SPILLTOVSR_STX", []>;		"#SPILLTOVSR_STX", []>;
def SPILLTOVSR_ST : PPCPostRAExpPseudo<(outs), (ins spilltovsrrc:$XT, memrix:$dst),		def SPILLTOVSR_ST : PPCPostRAExpPseudo<(outs), (ins spilltovsrrc:$XT, memrix:$dst),
"#SPILLTOVSR_ST", []>;		"#SPILLTOVSR_ST", []>;
}		}
[54 lines not shown]	dag F32Min = (COPY_TO_REGCLASS (XSMINDP (COPY_TO_REGCLASS $A, VSFRC),
(COPY_TO_REGCLASS $B, VSFRC)),		(COPY_TO_REGCLASS $B, VSFRC)),
VSSRC);		VSSRC);
dag F32Max = (COPY_TO_REGCLASS (XSMAXDP (COPY_TO_REGCLASS $A, VSFRC),		dag F32Max = (COPY_TO_REGCLASS (XSMAXDP (COPY_TO_REGCLASS $A, VSFRC),
(COPY_TO_REGCLASS $B, VSFRC)),		(COPY_TO_REGCLASS $B, VSFRC)),
VSSRC);		VSSRC);
}		}

def ScalarLoads {		def ScalarLoads {
dag Li8 = (i32 (extloadi8 xoaddr:$src));		dag Li8 = (i32 (extloadi8 ForceXForm:$src));
dag ZELi8 = (i32 (zextloadi8 xoaddr:$src));		dag ZELi8 = (i32 (zextloadi8 ForceXForm:$src));
dag ZELi8i64 = (i64 (zextloadi8 xoaddr:$src));		dag ZELi8i64 = (i64 (zextloadi8 ForceXForm:$src));
dag SELi8 = (i32 (sext_inreg (extloadi8 xoaddr:$src), i8));		dag SELi8 = (i32 (sext_inreg (extloadi8 ForceXForm:$src), i8));
dag SELi8i64 = (i64 (sext_inreg (extloadi8 xoaddr:$src), i8));		dag SELi8i64 = (i64 (sext_inreg (extloadi8 ForceXForm:$src), i8));

dag Li16 = (i32 (extloadi16 xoaddr:$src));		dag Li16 = (i32 (extloadi16 ForceXForm:$src));
dag ZELi16 = (i32 (zextloadi16 xoaddr:$src));		dag ZELi16 = (i32 (zextloadi16 ForceXForm:$src));
dag ZELi16i64 = (i64 (zextloadi16 xoaddr:$src));		dag ZELi16i64 = (i64 (zextloadi16 ForceXForm:$src));
dag SELi16 = (i32 (sextloadi16 xoaddr:$src));		dag SELi16 = (i32 (sextloadi16 ForceXForm:$src));
dag SELi16i64 = (i64 (sextloadi16 xoaddr:$src));		dag SELi16i64 = (i64 (sextloadi16 ForceXForm:$src));

dag Li32 = (i32 (load xoaddr:$src));		dag Li32 = (i32 (load ForceXForm:$src));
}		}

def DWToSPExtractConv {		def DWToSPExtractConv {
dag El0US1 = (f32 (PPCfcfidus		dag El0US1 = (f32 (PPCfcfidus
(f64 (PPCmtvsra (i64 (vector_extract v2i64:$S1, 0))))));		(f64 (PPCmtvsra (i64 (vector_extract v2i64:$S1, 0))))));
dag El1US1 = (f32 (PPCfcfidus		dag El1US1 = (f32 (PPCfcfidus
(f64 (PPCmtvsra (i64 (vector_extract v2i64:$S1, 1))))));		(f64 (PPCmtvsra (i64 (vector_extract v2i64:$S1, 1))))));
dag El0US2 = (f32 (PPCfcfidus		dag El0US2 = (f32 (PPCfcfidus
[434 lines not shown]
def WordToDWord {		def WordToDWord {
dag LE_A0 = (i64 (sext (i32 (vector_extract v4i32:$A, 0))));		dag LE_A0 = (i64 (sext (i32 (vector_extract v4i32:$A, 0))));
dag LE_A1 = (i64 (sext (i32 (vector_extract v4i32:$A, 2))));		dag LE_A1 = (i64 (sext (i32 (vector_extract v4i32:$A, 2))));
dag BE_A0 = (i64 (sext (i32 (vector_extract v4i32:$A, 1))));		dag BE_A0 = (i64 (sext (i32 (vector_extract v4i32:$A, 1))));
dag BE_A1 = (i64 (sext (i32 (vector_extract v4i32:$A, 3))));		dag BE_A1 = (i64 (sext (i32 (vector_extract v4i32:$A, 3))));
}		}

def FltToIntLoad {		def FltToIntLoad {
dag A = (i32 (PPCmfvsr (PPCfctiwz (f64 (extloadf32 xoaddr:$A)))));		dag A = (i32 (PPCmfvsr (PPCfctiwz (f64 (extloadf32 ForceXForm:$A)))));
}		}
def FltToUIntLoad {		def FltToUIntLoad {
dag A = (i32 (PPCmfvsr (PPCfctiwuz (f64 (extloadf32 xoaddr:$A)))));		dag A = (i32 (PPCmfvsr (PPCfctiwuz (f64 (extloadf32 ForceXForm:$A)))));
}		}
def FltToLongLoad {		def FltToLongLoad {
dag A = (i64 (PPCmfvsr (PPCfctidz (f64 (extloadf32 xoaddr:$A)))));		dag A = (i64 (PPCmfvsr (PPCfctidz (f64 (extloadf32 ForceXForm:$A)))));
}		}
def FltToLongLoadP9 {		def FltToLongLoadP9 {
dag A = (i64 (PPCmfvsr (PPCfctidz (f64 (extloadf32 iaddrX4:$A)))));		dag A = (i64 (PPCmfvsr (PPCfctidz (f64 (extloadf32 DSForm:$A)))));
}		}
def FltToULongLoad {		def FltToULongLoad {
dag A = (i64 (PPCmfvsr (PPCfctiduz (f64 (extloadf32 xoaddr:$A)))));		dag A = (i64 (PPCmfvsr (PPCfctiduz (f64 (extloadf32 ForceXForm:$A)))));
}		}
def FltToULongLoadP9 {		def FltToULongLoadP9 {
dag A = (i64 (PPCmfvsr (PPCfctiduz (f64 (extloadf32 iaddrX4:$A)))));		dag A = (i64 (PPCmfvsr (PPCfctiduz (f64 (extloadf32 DSForm:$A)))));
}		}
def FltToLong {		def FltToLong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctidz (fpextend f32:$A)))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctidz (fpextend f32:$A)))));
}		}
def FltToULong {		def FltToULong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz (fpextend f32:$A)))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz (fpextend f32:$A)))));
}		}
def DblToInt {		def DblToInt {
[10 lines not shown]
}		}
def DblToLong {		def DblToLong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctidz f64:$A))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctidz f64:$A))));
}		}
def DblToULong {		def DblToULong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz f64:$A))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz f64:$A))));
}		}
def DblToIntLoad {		def DblToIntLoad {
dag A = (i32 (PPCmfvsr (PPCfctiwz (f64 (load xoaddr:$A)))));		dag A = (i32 (PPCmfvsr (PPCfctiwz (f64 (load ForceXForm:$A)))));
}		}
def DblToIntLoadP9 {		def DblToIntLoadP9 {
dag A = (i32 (PPCmfvsr (PPCfctiwz (f64 (load iaddrX4:$A)))));		dag A = (i32 (PPCmfvsr (PPCfctiwz (f64 (load DSForm:$A)))));
}		}
def DblToUIntLoad {		def DblToUIntLoad {
dag A = (i32 (PPCmfvsr (PPCfctiwuz (f64 (load xoaddr:$A)))));		dag A = (i32 (PPCmfvsr (PPCfctiwuz (f64 (load ForceXForm:$A)))));
}		}
def DblToUIntLoadP9 {		def DblToUIntLoadP9 {
dag A = (i32 (PPCmfvsr (PPCfctiwuz (f64 (load iaddrX4:$A)))));		dag A = (i32 (PPCmfvsr (PPCfctiwuz (f64 (load DSForm:$A)))));
}		}
def DblToLongLoad {		def DblToLongLoad {
dag A = (i64 (PPCmfvsr (PPCfctidz (f64 (load xoaddr:$A)))));		dag A = (i64 (PPCmfvsr (PPCfctidz (f64 (load ForceXForm:$A)))));
}		}
def DblToULongLoad {		def DblToULongLoad {
dag A = (i64 (PPCmfvsr (PPCfctiduz (f64 (load xoaddr:$A)))));		dag A = (i64 (PPCmfvsr (PPCfctiduz (f64 (load ForceXForm:$A)))));
}		}

// FP load dags (for f32 -> v4f32)		// FP load dags (for f32 -> v4f32)
def LoadFP {		def LoadFP {
dag A = (f32 (load xoaddr:$A));		dag A = (f32 (load ForceXForm:$A));
dag B = (f32 (load xoaddr:$B));		dag B = (f32 (load ForceXForm:$B));
dag C = (f32 (load xoaddr:$C));		dag C = (f32 (load ForceXForm:$C));
dag D = (f32 (load xoaddr:$D));		dag D = (f32 (load ForceXForm:$D));
}		}

// FP merge dags (for f32 -> v4f32)		// FP merge dags (for f32 -> v4f32)
def MrgFP {		def MrgFP {
dag LD32A = (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$A), sub_64);		dag LD32A = (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$A), sub_64);
dag LD32B = (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$B), sub_64);		dag LD32B = (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$B), sub_64);
dag LD32C = (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$C), sub_64);		dag LD32C = (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$C), sub_64);
dag LD32D = (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$D), sub_64);		dag LD32D = (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$D), sub_64);
dag AC = (XVCVDPSP (XXPERMDI (SUBREG_TO_REG (i64 1), $A, sub_64),		dag AC = (XVCVDPSP (XXPERMDI (SUBREG_TO_REG (i64 1), $A, sub_64),
(SUBREG_TO_REG (i64 1), $C, sub_64), 0));		(SUBREG_TO_REG (i64 1), $C, sub_64), 0));
dag BD = (XVCVDPSP (XXPERMDI (SUBREG_TO_REG (i64 1), $B, sub_64),		dag BD = (XVCVDPSP (XXPERMDI (SUBREG_TO_REG (i64 1), $B, sub_64),
(SUBREG_TO_REG (i64 1), $D, sub_64), 0));		(SUBREG_TO_REG (i64 1), $D, sub_64), 0));
dag ABhToFlt = (XVCVDPSP (XXPERMDI $A, $B, 0));		dag ABhToFlt = (XVCVDPSP (XXPERMDI $A, $B, 0));
dag ABlToFlt = (XVCVDPSP (XXPERMDI $A, $B, 3));		dag ABlToFlt = (XVCVDPSP (XXPERMDI $A, $B, 3));
dag BAhToFlt = (XVCVDPSP (XXPERMDI $B, $A, 0));		dag BAhToFlt = (XVCVDPSP (XXPERMDI $B, $A, 0));
dag BAlToFlt = (XVCVDPSP (XXPERMDI $B, $A, 3));		dag BAlToFlt = (XVCVDPSP (XXPERMDI $B, $A, 3));
[355 lines not shown]	def : Pat<(f64 (fmaxnum_ieee f64:$A, f64:$B)),
(f64 (XSMAXDP $A, $B))>;		(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), f64:$B)),		def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), f64:$B)),
(f64 (XSMAXDP $A, $B))>;		(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee f64:$A, (fcanonicalize f64:$B))),		def : Pat<(f64 (fmaxnum_ieee f64:$A, (fcanonicalize f64:$B))),
(f64 (XSMAXDP $A, $B))>;		(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), (fcanonicalize f64:$B))),		def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), (fcanonicalize f64:$B))),
(f64 (XSMAXDP $A, $B))>;		(f64 (XSMAXDP $A, $B))>;

def : Pat<(int_ppc_vsx_stxvd2x_be v2f64:$rS, xoaddr:$dst),		def : Pat<(int_ppc_vsx_stxvd2x_be v2f64:$rS, ForceXForm:$dst),
(STXVD2X $rS, xoaddr:$dst)>;		(STXVD2X $rS, ForceXForm:$dst)>;
def : Pat<(int_ppc_vsx_stxvw4x_be v4i32:$rS, xoaddr:$dst),		def : Pat<(int_ppc_vsx_stxvw4x_be v4i32:$rS, ForceXForm:$dst),
(STXVW4X $rS, xoaddr:$dst)>;		(STXVW4X $rS, ForceXForm:$dst)>;
def : Pat<(v4i32 (int_ppc_vsx_lxvw4x_be xoaddr:$src)), (LXVW4X xoaddr:$src)>;		def : Pat<(v4i32 (int_ppc_vsx_lxvw4x_be ForceXForm:$src)), (LXVW4X ForceXForm:$src)>;
def : Pat<(v2f64 (int_ppc_vsx_lxvd2x_be xoaddr:$src)), (LXVD2X xoaddr:$src)>;		def : Pat<(v2f64 (int_ppc_vsx_lxvd2x_be ForceXForm:$src)), (LXVD2X ForceXForm:$src)>;

// Rounding for single precision.		// Rounding for single precision.
def : Pat<(f32 (any_fround f32:$S)),		def : Pat<(f32 (any_fround f32:$S)),
(f32 (COPY_TO_REGCLASS (XSRDPI		(f32 (COPY_TO_REGCLASS (XSRDPI
(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;		(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;
def : Pat<(f32 (fnearbyint f32:$S)),		def : Pat<(f32 (fnearbyint f32:$S)),
(f32 (COPY_TO_REGCLASS (XSRDPIC		(f32 (COPY_TO_REGCLASS (XSRDPIC
(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;		(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;
[29 lines not shown]
def : Pat<(v2i64 (build_vector DblToLong.A, DblToLong.A)),		def : Pat<(v2i64 (build_vector DblToLong.A, DblToLong.A)),
(v2i64 (XXPERMDI (SUBREG_TO_REG (i64 1), (XSCVDPSXDS $A), sub_64),		(v2i64 (XXPERMDI (SUBREG_TO_REG (i64 1), (XSCVDPSXDS $A), sub_64),
(SUBREG_TO_REG (i64 1), (XSCVDPSXDS $A), sub_64), 0))>;		(SUBREG_TO_REG (i64 1), (XSCVDPSXDS $A), sub_64), 0))>;
def : Pat<(v2i64 (build_vector DblToULong.A, DblToULong.A)),		def : Pat<(v2i64 (build_vector DblToULong.A, DblToULong.A)),
(v2i64 (XXPERMDI (SUBREG_TO_REG (i64 1), (XSCVDPUXDS $A), sub_64),		(v2i64 (XXPERMDI (SUBREG_TO_REG (i64 1), (XSCVDPUXDS $A), sub_64),
(SUBREG_TO_REG (i64 1), (XSCVDPUXDS $A), sub_64), 0))>;		(SUBREG_TO_REG (i64 1), (XSCVDPUXDS $A), sub_64), 0))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, FltToIntLoad.A,		v4i32, FltToIntLoad.A,
(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPSXWSs (XFLOADf32 xoaddr:$A)), sub_64), 1),		(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPSXWSs (XFLOADf32 ForceXForm:$A)), sub_64), 1),
(SUBREG_TO_REG (i64 1), (XSCVDPSXWSs (XFLOADf32 xoaddr:$A)), sub_64)>;		(SUBREG_TO_REG (i64 1), (XSCVDPSXWSs (XFLOADf32 ForceXForm:$A)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, FltToUIntLoad.A,		v4i32, FltToUIntLoad.A,
(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPUXWSs (XFLOADf32 xoaddr:$A)), sub_64), 1),		(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPUXWSs (XFLOADf32 ForceXForm:$A)), sub_64), 1),
(SUBREG_TO_REG (i64 1), (XSCVDPUXWSs (XFLOADf32 xoaddr:$A)), sub_64)>;		(SUBREG_TO_REG (i64 1), (XSCVDPUXWSs (XFLOADf32 ForceXForm:$A)), sub_64)>;
def : Pat<(v4f32 (build_vector f32:$A, f32:$A, f32:$A, f32:$A)),		def : Pat<(v4f32 (build_vector f32:$A, f32:$A, f32:$A, f32:$A)),
(v4f32 (XXSPLTW (v4f32 (XSCVDPSPN $A)), 0))>;		(v4f32 (XXSPLTW (v4f32 (XSCVDPSPN $A)), 0))>;
def : Pat<(v2f64 (PPCldsplat xoaddr:$A)),		def : Pat<(v2f64 (PPCldsplat ForceXForm:$A)),
(v2f64 (LXVDSX xoaddr:$A))>;		(v2f64 (LXVDSX ForceXForm:$A))>;
def : Pat<(v2i64 (PPCldsplat xoaddr:$A)),		def : Pat<(v2i64 (PPCldsplat ForceXForm:$A)),
(v2i64 (LXVDSX xoaddr:$A))>;		(v2i64 (LXVDSX ForceXForm:$A))>;

// Build vectors of floating point converted to i64.		// Build vectors of floating point converted to i64.
def : Pat<(v2i64 (build_vector FltToLong.A, FltToLong.A)),		def : Pat<(v2i64 (build_vector FltToLong.A, FltToLong.A)),
(v2i64 (XXPERMDIs		(v2i64 (XXPERMDIs
(COPY_TO_REGCLASS (XSCVDPSXDSs $A), VSFRC), 0))>;		(COPY_TO_REGCLASS (XSCVDPSXDSs $A), VSFRC), 0))>;
def : Pat<(v2i64 (build_vector FltToULong.A, FltToULong.A)),		def : Pat<(v2i64 (build_vector FltToULong.A, FltToULong.A)),
(v2i64 (XXPERMDIs		(v2i64 (XXPERMDIs
(COPY_TO_REGCLASS (XSCVDPUXDSs $A), VSFRC), 0))>;		(COPY_TO_REGCLASS (XSCVDPUXDSs $A), VSFRC), 0))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, DblToLongLoad.A,		v2i64, DblToLongLoad.A,
(XVCVDPSXDS (LXVDSX xoaddr:$A)), (XVCVDPSXDS (LXVDSX xoaddr:$A))>;		(XVCVDPSXDS (LXVDSX ForceXForm:$A)), (XVCVDPSXDS (LXVDSX ForceXForm:$A))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, DblToULongLoad.A,		v2i64, DblToULongLoad.A,
(XVCVDPUXDS (LXVDSX xoaddr:$A)), (XVCVDPUXDS (LXVDSX xoaddr:$A))>;		(XVCVDPUXDS (LXVDSX ForceXForm:$A)), (XVCVDPUXDS (LXVDSX ForceXForm:$A))>;
} // HasVSX		} // HasVSX

// Any big endian VSX subtarget.		// Any big endian VSX subtarget.
let Predicates = [HasVSX, IsBigEndian] in {		let Predicates = [HasVSX, IsBigEndian] in {
def : Pat<(v2f64 (scalar_to_vector f64:$A)),		def : Pat<(v2f64 (scalar_to_vector f64:$A)),
(v2f64 (SUBREG_TO_REG (i64 1), $A, sub_64))>;		(v2f64 (SUBREG_TO_REG (i64 1), $A, sub_64))>;

def : Pat<(f64 (extractelt v2f64:$S, 0)),		def : Pat<(f64 (extractelt v2f64:$S, 0)),
[111 lines not shown]	defm : ScalToVecWPermute<v2f64, (f64 f64:$A),
(SUBREG_TO_REG (i64 1), $A, sub_64), 0),		(SUBREG_TO_REG (i64 1), $A, sub_64), 0),
(SUBREG_TO_REG (i64 1), $A, sub_64)>;		(SUBREG_TO_REG (i64 1), $A, sub_64)>;

def : Pat<(f64 (extractelt v2f64:$S, 0)),		def : Pat<(f64 (extractelt v2f64:$S, 0)),
(f64 (EXTRACT_SUBREG (XXPERMDI $S, $S, 2), sub_64))>;		(f64 (EXTRACT_SUBREG (XXPERMDI $S, $S, 2), sub_64))>;
def : Pat<(f64 (extractelt v2f64:$S, 1)),		def : Pat<(f64 (extractelt v2f64:$S, 1)),
(f64 (EXTRACT_SUBREG $S, sub_64))>;		(f64 (EXTRACT_SUBREG $S, sub_64))>;

def : Pat<(v2f64 (PPCld_vec_be xoaddr:$src)), (LXVD2X xoaddr:$src)>;		def : Pat<(v2f64 (PPCld_vec_be ForceXForm:$src)), (LXVD2X ForceXForm:$src)>;
def : Pat<(PPCst_vec_be v2f64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;		def : Pat<(PPCst_vec_be v2f64:$rS, ForceXForm:$dst), (STXVD2X $rS, ForceXForm:$dst)>;
def : Pat<(v4f32 (PPCld_vec_be xoaddr:$src)), (LXVW4X xoaddr:$src)>;		def : Pat<(v4f32 (PPCld_vec_be ForceXForm:$src)), (LXVW4X ForceXForm:$src)>;
def : Pat<(PPCst_vec_be v4f32:$rS, xoaddr:$dst), (STXVW4X $rS, xoaddr:$dst)>;		def : Pat<(PPCst_vec_be v4f32:$rS, ForceXForm:$dst), (STXVW4X $rS, ForceXForm:$dst)>;
def : Pat<(v2i64 (PPCld_vec_be xoaddr:$src)), (LXVD2X xoaddr:$src)>;		def : Pat<(v2i64 (PPCld_vec_be ForceXForm:$src)), (LXVD2X ForceXForm:$src)>;
def : Pat<(PPCst_vec_be v2i64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;		def : Pat<(PPCst_vec_be v2i64:$rS, ForceXForm:$dst), (STXVD2X $rS, ForceXForm:$dst)>;
def : Pat<(v4i32 (PPCld_vec_be xoaddr:$src)), (LXVW4X xoaddr:$src)>;		def : Pat<(v4i32 (PPCld_vec_be ForceXForm:$src)), (LXVW4X ForceXForm:$src)>;
def : Pat<(PPCst_vec_be v4i32:$rS, xoaddr:$dst), (STXVW4X $rS, xoaddr:$dst)>;		def : Pat<(PPCst_vec_be v4i32:$rS, ForceXForm:$dst), (STXVW4X $rS, ForceXForm:$dst)>;
def : Pat<(f64 (PPCfcfid (PPCmtvsra (i64 (vector_extract v2i64:$S, 0))))),		def : Pat<(f64 (PPCfcfid (PPCmtvsra (i64 (vector_extract v2i64:$S, 0))))),
(f64 (XSCVSXDDP (COPY_TO_REGCLASS (XXPERMDI $S, $S, 2), VSFRC)))>;		(f64 (XSCVSXDDP (COPY_TO_REGCLASS (XXPERMDI $S, $S, 2), VSFRC)))>;
def : Pat<(f64 (PPCfcfid (PPCmtvsra (i64 (vector_extract v2i64:$S, 1))))),		def : Pat<(f64 (PPCfcfid (PPCmtvsra (i64 (vector_extract v2i64:$S, 1))))),
(f64 (XSCVSXDDP (COPY_TO_REGCLASS (f64 (COPY_TO_REGCLASS $S, VSRC)), VSFRC)))>;		(f64 (XSCVSXDDP (COPY_TO_REGCLASS (f64 (COPY_TO_REGCLASS $S, VSRC)), VSFRC)))>;
def : Pat<(f64 (PPCfcfidu (PPCmtvsra (i64 (vector_extract v2i64:$S, 0))))),		def : Pat<(f64 (PPCfcfidu (PPCmtvsra (i64 (vector_extract v2i64:$S, 0))))),
(f64 (XSCVUXDDP (COPY_TO_REGCLASS (XXPERMDI $S, $S, 2), VSFRC)))>;		(f64 (XSCVUXDDP (COPY_TO_REGCLASS (XXPERMDI $S, $S, 2), VSFRC)))>;
def : Pat<(f64 (PPCfcfidu (PPCmtvsra (i64 (vector_extract v2i64:$S, 1))))),		def : Pat<(f64 (PPCfcfidu (PPCmtvsra (i64 (vector_extract v2i64:$S, 1))))),
(f64 (XSCVUXDDP (COPY_TO_REGCLASS (f64 (COPY_TO_REGCLASS $S, VSRC)), VSFRC)))>;		(f64 (XSCVUXDDP (COPY_TO_REGCLASS (f64 (COPY_TO_REGCLASS $S, VSRC)), VSFRC)))>;
[92 lines not shown]	def : Pat<(v2f64 (insertelt v2f64:$A, f64:$B, 0)),
(v2f64 (XXPERMDI $A, (SUBREG_TO_REG (i64 1), $B, sub_64), 0))>;		(v2f64 (XXPERMDI $A, (SUBREG_TO_REG (i64 1), $B, sub_64), 0))>;
def : Pat<(v2f64 (insertelt v2f64:$A, f64:$B, 1)),		def : Pat<(v2f64 (insertelt v2f64:$A, f64:$B, 1)),
(v2f64 (XXPERMDI (SUBREG_TO_REG (i64 1), $B, sub_64), $A, 1))>;		(v2f64 (XXPERMDI (SUBREG_TO_REG (i64 1), $B, sub_64), $A, 1))>;
} // HasVSX, IsLittleEndian		} // HasVSX, IsLittleEndian

// Any pre-Power9 VSX subtarget.		// Any pre-Power9 VSX subtarget.
let Predicates = [HasVSX, NoP9Vector] in {		let Predicates = [HasVSX, NoP9Vector] in {
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), xoaddr:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), ForceXForm:$dst, 8),
(STXSDX (XSCVDPSXDS f64:$src), xoaddr:$dst)>;		(STXSDX (XSCVDPSXDS f64:$src), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), xoaddr:$dst, 8),		(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), ForceXForm:$dst, 8),
(STXSDX (XSCVDPUXDS f64:$src), xoaddr:$dst)>;		(STXSDX (XSCVDPUXDS f64:$src), ForceXForm:$dst)>;

// Load-and-splat with fp-to-int conversion (using X-Form VSX/FP loads).		// Load-and-splat with fp-to-int conversion (using X-Form VSX/FP loads).
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, DblToIntLoad.A,		v4i32, DblToIntLoad.A,
(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPSXWS (XFLOADf64 xoaddr:$A)), sub_64), 1),		(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPSXWS (XFLOADf64 ForceXForm:$A)), sub_64), 1),
(SUBREG_TO_REG (i64 1), (XSCVDPSXWS (XFLOADf64 xoaddr:$A)), sub_64)>;		(SUBREG_TO_REG (i64 1), (XSCVDPSXWS (XFLOADf64 ForceXForm:$A)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, DblToUIntLoad.A,		v4i32, DblToUIntLoad.A,
(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPUXWS (XFLOADf64 xoaddr:$A)), sub_64), 1),		(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPUXWS (XFLOADf64 ForceXForm:$A)), sub_64), 1),
(SUBREG_TO_REG (i64 1), (XSCVDPUXWS (XFLOADf64 xoaddr:$A)), sub_64)>;		(SUBREG_TO_REG (i64 1), (XSCVDPUXWS (XFLOADf64 ForceXForm:$A)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, FltToLongLoad.A,		v2i64, FltToLongLoad.A,
(XXPERMDIs (XSCVDPSXDS (COPY_TO_REGCLASS (XFLOADf32 xoaddr:$A), VSFRC)), 0),		(XXPERMDIs (XSCVDPSXDS (COPY_TO_REGCLASS (XFLOADf32 ForceXForm:$A), VSFRC)), 0),
(SUBREG_TO_REG (i64 1), (XSCVDPSXDS (COPY_TO_REGCLASS (XFLOADf32 xoaddr:$A),		(SUBREG_TO_REG (i64 1), (XSCVDPSXDS (COPY_TO_REGCLASS (XFLOADf32 ForceXForm:$A),
VSFRC)), sub_64)>;		VSFRC)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, FltToULongLoad.A,		v2i64, FltToULongLoad.A,
(XXPERMDIs (XSCVDPUXDS (COPY_TO_REGCLASS (XFLOADf32 xoaddr:$A), VSFRC)), 0),		(XXPERMDIs (XSCVDPUXDS (COPY_TO_REGCLASS (XFLOADf32 ForceXForm:$A), VSFRC)), 0),
(SUBREG_TO_REG (i64 1), (XSCVDPUXDS (COPY_TO_REGCLASS (XFLOADf32 xoaddr:$A),		(SUBREG_TO_REG (i64 1), (XSCVDPUXDS (COPY_TO_REGCLASS (XFLOADf32 ForceXForm:$A),
VSFRC)), sub_64)>;		VSFRC)), sub_64)>;
} // HasVSX, NoP9Vector		} // HasVSX, NoP9Vector

// Any little endian pre-Power9 VSX subtarget.		// Any little endian pre-Power9 VSX subtarget.
let Predicates = [HasVSX, NoP9Vector, IsLittleEndian] in {		let Predicates = [HasVSX, NoP9Vector, IsLittleEndian] in {
// Load-and-splat using only X-Form VSX loads.		// Load-and-splat using only X-Form VSX loads.
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, (i64 (load xoaddr:$src)),		v2i64, (i64 (load ForceXForm:$src)),
(XXPERMDIs (XFLOADf64 xoaddr:$src), 2),		(XXPERMDIs (XFLOADf64 ForceXForm:$src), 2),
(SUBREG_TO_REG (i64 1), (XFLOADf64 xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (XFLOADf64 ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2f64, (f64 (load xoaddr:$src)),		v2f64, (f64 (load ForceXForm:$src)),
(XXPERMDIs (XFLOADf64 xoaddr:$src), 2),		(XXPERMDIs (XFLOADf64 ForceXForm:$src), 2),
(SUBREG_TO_REG (i64 1), (XFLOADf64 xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (XFLOADf64 ForceXForm:$src), sub_64)>;
} // HasVSX, NoP9Vector, IsLittleEndian		} // HasVSX, NoP9Vector, IsLittleEndian

// Any VSX subtarget that only has loads and stores that load in big endian		// Any VSX subtarget that only has loads and stores that load in big endian
// order regardless of endianness. This is really pre-Power9 subtargets.		// order regardless of endianness. This is really pre-Power9 subtargets.
let Predicates = [HasVSX, HasOnlySwappingMemOps] in {		let Predicates = [HasVSX, HasOnlySwappingMemOps] in {
def : Pat<(v2f64 (PPClxvd2x xoaddr:$src)), (LXVD2X xoaddr:$src)>;		def : Pat<(v2f64 (PPClxvd2x ForceXForm:$src)), (LXVD2X ForceXForm:$src)>;

// Stores.		// Stores.
def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, xoaddr:$dst),		def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, ForceXForm:$dst),
(STXVD2X $rS, xoaddr:$dst)>;		(STXVD2X $rS, ForceXForm:$dst)>;
def : Pat<(PPCstxvd2x v2f64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;		def : Pat<(PPCstxvd2x v2f64:$rS, ForceXForm:$dst), (STXVD2X $rS, ForceXForm:$dst)>;
} // HasVSX, HasOnlySwappingMemOps		} // HasVSX, HasOnlySwappingMemOps

// Big endian VSX subtarget that only has loads and stores that always		// Big endian VSX subtarget that only has loads and stores that always
// load in big endian order. Really big endian pre-Power9 subtargets.		// load in big endian order. Really big endian pre-Power9 subtargets.
let Predicates = [HasVSX, HasOnlySwappingMemOps, IsBigEndian] in {		let Predicates = [HasVSX, HasOnlySwappingMemOps, IsBigEndian] in {
def : Pat<(v2f64 (load xoaddr:$src)), (LXVD2X xoaddr:$src)>;		def : Pat<(v2f64 (load ForceXForm:$src)), (LXVD2X ForceXForm:$src)>;
def : Pat<(v2i64 (load xoaddr:$src)), (LXVD2X xoaddr:$src)>;		def : Pat<(v2i64 (load ForceXForm:$src)), (LXVD2X ForceXForm:$src)>;
def : Pat<(v4i32 (load xoaddr:$src)), (LXVW4X xoaddr:$src)>;		def : Pat<(v4i32 (load ForceXForm:$src)), (LXVW4X ForceXForm:$src)>;
def : Pat<(v4i32 (int_ppc_vsx_lxvw4x xoaddr:$src)), (LXVW4X xoaddr:$src)>;		def : Pat<(v4i32 (int_ppc_vsx_lxvw4x ForceXForm:$src)), (LXVW4X ForceXForm:$src)>;
def : Pat<(store v2f64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;		def : Pat<(store v2f64:$rS, ForceXForm:$dst), (STXVD2X $rS, ForceXForm:$dst)>;
def : Pat<(store v2i64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;		def : Pat<(store v2i64:$rS, ForceXForm:$dst), (STXVD2X $rS, ForceXForm:$dst)>;
def : Pat<(store v4i32:$XT, xoaddr:$dst), (STXVW4X $XT, xoaddr:$dst)>;		def : Pat<(store v4i32:$XT, ForceXForm:$dst), (STXVW4X $XT, ForceXForm:$dst)>;
def : Pat<(int_ppc_vsx_stxvw4x v4i32:$rS, xoaddr:$dst),		def : Pat<(int_ppc_vsx_stxvw4x v4i32:$rS, ForceXForm:$dst),
(STXVW4X $rS, xoaddr:$dst)>;		(STXVW4X $rS, ForceXForm:$dst)>;
def : Pat<(v2i64 (scalar_to_vector (i64 (load xoaddr:$src)))),		def : Pat<(v2i64 (scalar_to_vector (i64 (load ForceXForm:$src)))),
(SUBREG_TO_REG (i64 1), (XFLOADf64 xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (XFLOADf64 ForceXForm:$src), sub_64)>;
} // HasVSX, HasOnlySwappingMemOps, IsBigEndian		} // HasVSX, HasOnlySwappingMemOps, IsBigEndian

// Any Power8 VSX subtarget.		// Any Power8 VSX subtarget.
let Predicates = [HasVSX, HasP8Vector] in {		let Predicates = [HasVSX, HasP8Vector] in {
def : Pat<(int_ppc_vsx_xxleqv v4i32:$A, v4i32:$B),		def : Pat<(int_ppc_vsx_xxleqv v4i32:$A, v4i32:$B),
(XXLEQV $A, $B)>;		(XXLEQV $A, $B)>;
def : Pat<(f64 (extloadf32 xoaddr:$src)),		def : Pat<(f64 (extloadf32 ForceXForm:$src)),
(COPY_TO_REGCLASS (XFLOADf32 xoaddr:$src), VSFRC)>;		(COPY_TO_REGCLASS (XFLOADf32 ForceXForm:$src), VSFRC)>;
def : Pat<(f32 (fpround (f64 (extloadf32 xoaddr:$src)))),		def : Pat<(f32 (fpround (f64 (extloadf32 ForceXForm:$src)))),
(f32 (XFLOADf32 xoaddr:$src))>;		(f32 (XFLOADf32 ForceXForm:$src))>;
def : Pat<(f64 (any_fpextend f32:$src)),		def : Pat<(f64 (any_fpextend f32:$src)),
(COPY_TO_REGCLASS $src, VSFRC)>;		(COPY_TO_REGCLASS $src, VSFRC)>;

def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETLT)),		def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETLT)),
(SELECT_VSSRC (CRANDC $lhs, $rhs), $tval, $fval)>;		(SELECT_VSSRC (CRANDC $lhs, $rhs), $tval, $fval)>;
def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETULT)),		def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETULT)),
(SELECT_VSSRC (CRANDC $rhs, $lhs), $tval, $fval)>;		(SELECT_VSSRC (CRANDC $rhs, $lhs), $tval, $fval)>;
def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETLE)),		def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETLE)),
// ... 26 lines not shown ...
// so that FNMSUBS can be selected for fneg-fmsub pattern on P7. (VSX version,		// so that FNMSUBS can be selected for fneg-fmsub pattern on P7. (VSX version,
// XSNMSUBASP, is available since P8)		// XSNMSUBASP, is available since P8)
def : Pat<(f32 (fneg f32:$S)),		def : Pat<(f32 (fneg f32:$S)),
(f32 (COPY_TO_REGCLASS (XSNEGDP		(f32 (COPY_TO_REGCLASS (XSNEGDP
(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;		(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;

// Instructions for converting float to i32 feeding a store.		// Instructions for converting float to i32 feeding a store.
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), xoaddr:$dst, 4),		(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), ForceXForm:$dst, 4),
(STIWX (XSCVDPSXWS f64:$src), xoaddr:$dst)>;		(STIWX (XSCVDPSXWS f64:$src), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), xoaddr:$dst, 4),		(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), ForceXForm:$dst, 4),
(STIWX (XSCVDPUXWS f64:$src), xoaddr:$dst)>;		(STIWX (XSCVDPUXWS f64:$src), ForceXForm:$dst)>;

def : Pat<(v2i64 (smax v2i64:$src1, v2i64:$src2)),		def : Pat<(v2i64 (smax v2i64:$src1, v2i64:$src2)),
(v2i64 (VMAXSD (COPY_TO_REGCLASS $src1, VRRC),		(v2i64 (VMAXSD (COPY_TO_REGCLASS $src1, VRRC),
(COPY_TO_REGCLASS $src2, VRRC)))>;		(COPY_TO_REGCLASS $src2, VRRC)))>;
def : Pat<(v2i64 (umax v2i64:$src1, v2i64:$src2)),		def : Pat<(v2i64 (umax v2i64:$src1, v2i64:$src2)),
(v2i64 (VMAXUD (COPY_TO_REGCLASS $src1, VRRC),		(v2i64 (VMAXUD (COPY_TO_REGCLASS $src1, VRRC),
(COPY_TO_REGCLASS $src2, VRRC)))>;		(COPY_TO_REGCLASS $src2, VRRC)))>;
def : Pat<(v2i64 (smin v2i64:$src1, v2i64:$src2)),		def : Pat<(v2i64 (smin v2i64:$src1, v2i64:$src2)),
// ... 62 lines not shown ...
// Big endian Power8 64Bit VSX subtarget.		// Big endian Power8 64Bit VSX subtarget.
let Predicates = [HasVSX, HasP8Vector, IsBigEndian, IsPPC64] in {		let Predicates = [HasVSX, HasP8Vector, IsBigEndian, IsPPC64] in {
def : Pat<(f32 (vector_extract v4f32:$S, i64:$Idx)),		def : Pat<(f32 (vector_extract v4f32:$S, i64:$Idx)),
(f32 VectorExtractions.BE_VARIABLE_FLOAT)>;		(f32 VectorExtractions.BE_VARIABLE_FLOAT)>;

// LIWAX - This instruction is used for sign extending i32 -> i64.		// LIWAX - This instruction is used for sign extending i32 -> i64.
// LIWZX - This instruction will be emitted for i32, f32, and when		// LIWZX - This instruction will be emitted for i32, f32, and when
// zero-extending i32 to i64 (zext i32 -> i64).		// zero-extending i32 to i64 (zext i32 -> i64).
def : Pat<(v2i64 (scalar_to_vector (i64 (sextloadi32 xoaddr:$src)))),		def : Pat<(v2i64 (scalar_to_vector (i64 (sextloadi32 ForceXForm:$src)))),
(v2i64 (SUBREG_TO_REG (i64 1), (LIWAX xoaddr:$src), sub_64))>;		(v2i64 (SUBREG_TO_REG (i64 1), (LIWAX ForceXForm:$src), sub_64))>;
def : Pat<(v2i64 (scalar_to_vector (i64 (zextloadi32 xoaddr:$src)))),		def : Pat<(v2i64 (scalar_to_vector (i64 (zextloadi32 ForceXForm:$src)))),
(v2i64 (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$src), sub_64))>;		(v2i64 (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$src), sub_64))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, (i32 (load xoaddr:$src)),		v4i32, (i32 (load ForceXForm:$src)),
(XXSLDWIs (LIWZX xoaddr:$src), 1),		(XXSLDWIs (LIWZX ForceXForm:$src), 1),
(SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4f32, (f32 (load xoaddr:$src)),		v4f32, (f32 (load ForceXForm:$src)),
(XXSLDWIs (LIWZX xoaddr:$src), 1),		(XXSLDWIs (LIWZX ForceXForm:$src), 1),
(SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$src), sub_64)>;

def : Pat<DWToSPExtractConv.BVU,		def : Pat<DWToSPExtractConv.BVU,
(v4f32 (VPKUDUM (XXSLDWI (XVCVUXDSP $S1), (XVCVUXDSP $S1), 3),		(v4f32 (VPKUDUM (XXSLDWI (XVCVUXDSP $S1), (XVCVUXDSP $S1), 3),
(XXSLDWI (XVCVUXDSP $S2), (XVCVUXDSP $S2), 3)))>;		(XXSLDWI (XVCVUXDSP $S2), (XVCVUXDSP $S2), 3)))>;
def : Pat<DWToSPExtractConv.BVS,		def : Pat<DWToSPExtractConv.BVS,
(v4f32 (VPKUDUM (XXSLDWI (XVCVSXDSP $S1), (XVCVSXDSP $S1), 3),		(v4f32 (VPKUDUM (XXSLDWI (XVCVSXDSP $S1), (XVCVSXDSP $S1), 3),
(XXSLDWI (XVCVSXDSP $S2), (XVCVSXDSP $S2), 3)))>;		(XXSLDWI (XVCVSXDSP $S2), (XVCVSXDSP $S2), 3)))>;
def : Pat<(store (i32 (extractelt v4i32:$A, 1)), xoaddr:$src),		def : Pat<(store (i32 (extractelt v4i32:$A, 1)), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(STIWX (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;
def : Pat<(store (f32 (extractelt v4f32:$A, 1)), xoaddr:$src),		def : Pat<(store (f32 (extractelt v4f32:$A, 1)), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(STIWX (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;

// Elements in a register on a BE system are in order <0, 1, 2, 3>.		// Elements in a register on a BE system are in order <0, 1, 2, 3>.
// The store instructions store the second word from the left.		// The store instructions store the second word from the left.
// So to align element zero, we need to modulo-left-shift by 3 words.		// So to align element zero, we need to modulo-left-shift by 3 words.
// Similar logic applies for elements 2 and 3.		// Similar logic applies for elements 2 and 3.
foreach Idx = [ [0,3], [2,1], [3,2] ] in {		foreach Idx = [ [0,3], [2,1], [3,2] ] in {
def : Pat<(store (i32 (extractelt v4i32:$A, !head(Idx))), xoaddr:$src),		def : Pat<(store (i32 (extractelt v4i32:$A, !head(Idx))), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),		(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),
sub_64), xoaddr:$src)>;		sub_64), ForceXForm:$src)>;
def : Pat<(store (f32 (extractelt v4f32:$A, !head(Idx))), xoaddr:$src),		def : Pat<(store (f32 (extractelt v4f32:$A, !head(Idx))), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),		(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),
sub_64), xoaddr:$src)>;		sub_64), ForceXForm:$src)>;
}		}
} // HasVSX, HasP8Vector, IsBigEndian, IsPPC64		} // HasVSX, HasP8Vector, IsBigEndian, IsPPC64
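
The shift amounts in the `foreach` list above can be sanity-checked with a small model. The sketch below is a Python simplification, not LLVM code. It assumes that `XXSLDWI $A, $A, n` acts as a left rotation of the four words by `n`, and that `STIWX` on the `sub_64` subregister stores word 1 (the second word from the left), as the comment in the patterns states. The names `xxsldwi_same`, `stored_element_be`, and `BE_PAIRS` are hypothetical.

```python
def xxsldwi_same(words, n):
    """Model XXSLDWI $A, $A, n: rotate the four words left by n."""
    assert len(words) == 4 and 0 <= n <= 3
    return words[n:] + words[:n]


def stored_element_be(shift):
    """Return which big-endian element lands in word 1 after the shift."""
    register = [0, 1, 2, 3]  # BE: element i sits in word i, left to right
    return xxsldwi_same(register, shift)[1]


# (element, shift) pairs copied from the foreach list in the patterns above.
BE_PAIRS = [(0, 3), (2, 1), (3, 2)]
```

Under this model, element 1 needs no shift, which matches the two unshifted `STIWX` patterns that precede the `foreach`.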

// Little endian Power8 VSX subtarget.		// Little endian Power8 VSX subtarget.
let Predicates = [HasVSX, HasP8Vector, IsLittleEndian] in {		let Predicates = [HasVSX, HasP8Vector, IsLittleEndian] in {
def : Pat<DWToSPExtractConv.El0SS1,		def : Pat<DWToSPExtractConv.El0SS1,
(f32 (XSCVSXDSP (COPY_TO_REGCLASS (XXPERMDI $S1, $S1, 2), VSFRC)))>;		(f32 (XSCVSXDSP (COPY_TO_REGCLASS (XXPERMDI $S1, $S1, 2), VSFRC)))>;
def : Pat<DWToSPExtractConv.El1SS1,		def : Pat<DWToSPExtractConv.El1SS1,
// ... 36 lines not shown ...
def : Pat<(f64 (PPCfcfid (f64 (PPCmtvsra (i32 (extractelt v4i32:$A, 2)))))),		def : Pat<(f64 (PPCfcfid (f64 (PPCmtvsra (i32 (extractelt v4i32:$A, 2)))))),
(f64 (COPY_TO_REGCLASS (XVCVSXWDP (XXSPLTW $A, 1)), VSFRC))>;		(f64 (COPY_TO_REGCLASS (XVCVSXWDP (XXSPLTW $A, 1)), VSFRC))>;
def : Pat<(f64 (PPCfcfid (f64 (PPCmtvsra (i32 (extractelt v4i32:$A, 3)))))),		def : Pat<(f64 (PPCfcfid (f64 (PPCmtvsra (i32 (extractelt v4i32:$A, 3)))))),
(f64 (COPY_TO_REGCLASS (XVCVSXWDP (XXSPLTW $A, 0)), VSFRC))>;		(f64 (COPY_TO_REGCLASS (XVCVSXWDP (XXSPLTW $A, 0)), VSFRC))>;

// LIWAX - This instruction is used for sign extending i32 -> i64.		// LIWAX - This instruction is used for sign extending i32 -> i64.
// LIWZX - This instruction will be emitted for i32, f32, and when		// LIWZX - This instruction will be emitted for i32, f32, and when
// zero-extending i32 to i64 (zext i32 -> i64).		// zero-extending i32 to i64 (zext i32 -> i64).
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, (i64 (sextloadi32 xoaddr:$src)),		v2i64, (i64 (sextloadi32 ForceXForm:$src)),
(XXPERMDIs (LIWAX xoaddr:$src), 2),		(XXPERMDIs (LIWAX ForceXForm:$src), 2),
(SUBREG_TO_REG (i64 1), (LIWAX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LIWAX ForceXForm:$src), sub_64)>;

defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, (i64 (zextloadi32 xoaddr:$src)),		v2i64, (i64 (zextloadi32 ForceXForm:$src)),
(XXPERMDIs (LIWZX xoaddr:$src), 2),		(XXPERMDIs (LIWZX ForceXForm:$src), 2),
(SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$src), sub_64)>;

defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, (i32 (load xoaddr:$src)),		v4i32, (i32 (load ForceXForm:$src)),
(XXPERMDIs (LIWZX xoaddr:$src), 2),		(XXPERMDIs (LIWZX ForceXForm:$src), 2),
(SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$src), sub_64)>;

defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4f32, (f32 (load xoaddr:$src)),		v4f32, (f32 (load ForceXForm:$src)),
(XXPERMDIs (LIWZX xoaddr:$src), 2),		(XXPERMDIs (LIWZX ForceXForm:$src), 2),
(SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$src), sub_64)>;

def : Pat<DWToSPExtractConv.BVU,		def : Pat<DWToSPExtractConv.BVU,
(v4f32 (VPKUDUM (XXSLDWI (XVCVUXDSP $S2), (XVCVUXDSP $S2), 3),		(v4f32 (VPKUDUM (XXSLDWI (XVCVUXDSP $S2), (XVCVUXDSP $S2), 3),
(XXSLDWI (XVCVUXDSP $S1), (XVCVUXDSP $S1), 3)))>;		(XXSLDWI (XVCVUXDSP $S1), (XVCVUXDSP $S1), 3)))>;
def : Pat<DWToSPExtractConv.BVS,		def : Pat<DWToSPExtractConv.BVS,
(v4f32 (VPKUDUM (XXSLDWI (XVCVSXDSP $S2), (XVCVSXDSP $S2), 3),		(v4f32 (VPKUDUM (XXSLDWI (XVCVSXDSP $S2), (XVCVSXDSP $S2), 3),
(XXSLDWI (XVCVSXDSP $S1), (XVCVSXDSP $S1), 3)))>;		(XXSLDWI (XVCVSXDSP $S1), (XVCVSXDSP $S1), 3)))>;
def : Pat<(store (i32 (extractelt v4i32:$A, 2)), xoaddr:$src),		def : Pat<(store (i32 (extractelt v4i32:$A, 2)), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(STIWX (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;
def : Pat<(store (f32 (extractelt v4f32:$A, 2)), xoaddr:$src),		def : Pat<(store (f32 (extractelt v4f32:$A, 2)), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(STIWX (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;

// Elements in a register on a LE system are in order <3, 2, 1, 0>.		// Elements in a register on a LE system are in order <3, 2, 1, 0>.
// The store instructions store the second word from the left.		// The store instructions store the second word from the left.
// So to align element 3, we need to modulo-left-shift by 3 words.		// So to align element 3, we need to modulo-left-shift by 3 words.
// Similar logic applies for elements 0 and 1.		// Similar logic applies for elements 0 and 1.
foreach Idx = [ [0,2], [1,1], [3,3] ] in {		foreach Idx = [ [0,2], [1,1], [3,3] ] in {
def : Pat<(store (i32 (extractelt v4i32:$A, !head(Idx))), xoaddr:$src),		def : Pat<(store (i32 (extractelt v4i32:$A, !head(Idx))), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),		(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),
sub_64), xoaddr:$src)>;		sub_64), ForceXForm:$src)>;
def : Pat<(store (f32 (extractelt v4f32:$A, !head(Idx))), xoaddr:$src),		def : Pat<(store (f32 (extractelt v4f32:$A, !head(Idx))), ForceXForm:$src),
(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),		(STIWX (EXTRACT_SUBREG (XXSLDWI $A, $A, !head(!tail(Idx))),
sub_64), xoaddr:$src)>;		sub_64), ForceXForm:$src)>;
}		}
} // HasVSX, HasP8Vector, IsLittleEndian		} // HasVSX, HasP8Vector, IsLittleEndian
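
For the little-endian case, the shift amounts can be derived from the element order instead of just checked. The following Python sketch is a simplified model, not part of the patch. It assumes a left rotation by `n` words puts source word `(1 + n) mod 4` into word 1, the word that `STIWX` stores, so the required shift is `(position - 1) mod 4`. The helper names are hypothetical.

```python
def le_word_position(element):
    """LE register order is <3, 2, 1, 0>, so element e sits in word 3 - e."""
    return 3 - element


def shift_for_le_element(element):
    """Left rotation (in words) that moves the element into word 1."""
    return (le_word_position(element) - 1) % 4


# Shift amount for each of the four elements of a v4i32/v4f32 value.
LE_SHIFTS = {e: shift_for_le_element(e) for e in range(4)}
```

The non-zero entries correspond to the `[0,2]`, `[1,1]`, `[3,3]` pairs in the `foreach`, and the zero entry for element 2 corresponds to the unshifted patterns before it.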

// Big endian pre-Power9 VSX subtarget.		// Big endian pre-Power9 VSX subtarget.
let Predicates = [HasVSX, HasP8Vector, NoP9Vector, IsBigEndian, IsPPC64] in {		let Predicates = [HasVSX, HasP8Vector, NoP9Vector, IsBigEndian, IsPPC64] in {
def : Pat<(store (i64 (extractelt v2i64:$A, 0)), xoaddr:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), xoaddr:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
xoaddr:$src)>;		ForceXForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
xoaddr:$src)>;		ForceXForm:$src)>;
} // HasVSX, HasP8Vector, NoP9Vector, IsBigEndian, IsPPC64		} // HasVSX, HasP8Vector, NoP9Vector, IsBigEndian, IsPPC64

// Little endian pre-Power9 VSX subtarget.		// Little endian pre-Power9 VSX subtarget.
let Predicates = [HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian] in {		let Predicates = [HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian] in {
def : Pat<(store (i64 (extractelt v2i64:$A, 0)), xoaddr:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
xoaddr:$src)>;		ForceXForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
xoaddr:$src)>;		ForceXForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), xoaddr:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xoaddr:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), ForceXForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xoaddr:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), ForceXForm:$src)>;
} // HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian		} // HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian
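
The two pre-Power9 blocks above mirror each other: which doubleword element needs an `XXPERMDI` swap before the store depends only on endianness. A minimal Python sketch of that rule follows. It assumes `XFSTOREf64` on `sub_64` stores the leftmost doubleword and that `XXPERMDI $A, $A, 2` swaps the two doublewords. The function names are hypothetical.

```python
def xxpermdi_swap(dwords):
    """Model XXPERMDI $A, $A, 2: swap the two doublewords."""
    return [dwords[1], dwords[0]]


def needs_swap(element, little_endian):
    """True if the element must be swapped into doubleword 0 before storing."""
    # Position of the element counted from the left of the register.
    position = 1 - element if little_endian else element
    return position != 0
```

In this model, big-endian element 0 and little-endian element 1 are stored directly, while the other two cases go through the swap, matching the patterns.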

// Any VSX target with direct moves.		// Any VSX target with direct moves.
let Predicates = [HasVSX, HasDirectMove] in {		let Predicates = [HasVSX, HasDirectMove] in {
// bitconvert f32 -> i32		// bitconvert f32 -> i32
// (convert to 32-bit fp single, shift right 1 word, move to GPR)		// (convert to 32-bit fp single, shift right 1 word, move to GPR)
def : Pat<(i32 (bitconvert f32:$A)), Bitcast.FltToInt>;		def : Pat<(i32 (bitconvert f32:$A)), Bitcast.FltToInt>;

// ... 288 lines not shown ...
def : Pat<(f128 (any_uint_to_fp i64:$src)),		def : Pat<(f128 (any_uint_to_fp i64:$src)),
(f128 (XSCVUDQP (COPY_TO_REGCLASS $src, VFRC)))>;		(f128 (XSCVUDQP (COPY_TO_REGCLASS $src, VFRC)))>;
def : Pat<(f128 (any_uint_to_fp (i64 (PPCmfvsr f64:$src)))),		def : Pat<(f128 (any_uint_to_fp (i64 (PPCmfvsr f64:$src)))),
(f128 (XSCVUDQP $src))>;		(f128 (XSCVUDQP $src))>;

// Convert (Un)Signed Word -> QP.		// Convert (Un)Signed Word -> QP.
def : Pat<(f128 (any_sint_to_fp i32:$src)),		def : Pat<(f128 (any_sint_to_fp i32:$src)),
(f128 (XSCVSDQP (MTVSRWA $src)))>;		(f128 (XSCVSDQP (MTVSRWA $src)))>;
def : Pat<(f128 (any_sint_to_fp (i32 (load xoaddr:$src)))),		def : Pat<(f128 (any_sint_to_fp (i32 (load ForceXForm:$src)))),
(f128 (XSCVSDQP (LIWAX xoaddr:$src)))>;		(f128 (XSCVSDQP (LIWAX ForceXForm:$src)))>;
def : Pat<(f128 (any_uint_to_fp i32:$src)),		def : Pat<(f128 (any_uint_to_fp i32:$src)),
(f128 (XSCVUDQP (MTVSRWZ $src)))>;		(f128 (XSCVUDQP (MTVSRWZ $src)))>;
def : Pat<(f128 (any_uint_to_fp (i32 (load xoaddr:$src)))),		def : Pat<(f128 (any_uint_to_fp (i32 (load ForceXForm:$src)))),
(f128 (XSCVUDQP (LIWZX xoaddr:$src)))>;		(f128 (XSCVUDQP (LIWZX ForceXForm:$src)))>;

// Pattern for matching Vector HP -> Vector SP intrinsic. Defined as a		// Pattern for matching Vector HP -> Vector SP intrinsic. Defined as a
// separate pattern so that it can convert the input register class from		// separate pattern so that it can convert the input register class from
// VRRC(v8i16) to VSRC.		// VRRC(v8i16) to VSRC.
def : Pat<(v4f32 (int_ppc_vsx_xvcvhpsp v8i16:$A)),		def : Pat<(v4f32 (int_ppc_vsx_xvcvhpsp v8i16:$A)),
(v4f32 (XVCVHPSP (COPY_TO_REGCLASS $A, VSRC)))>;		(v4f32 (XVCVHPSP (COPY_TO_REGCLASS $A, VSRC)))>;

// Use current rounding mode		// Use current rounding mode
// ... 24 lines not shown ...

// Vector Reverse		// Vector Reverse
def : Pat<(v8i16 (bswap v8i16 :$A)),		def : Pat<(v8i16 (bswap v8i16 :$A)),
(v8i16 (COPY_TO_REGCLASS (XXBRH (COPY_TO_REGCLASS $A, VSRC)), VRRC))>;		(v8i16 (COPY_TO_REGCLASS (XXBRH (COPY_TO_REGCLASS $A, VSRC)), VRRC))>;
def : Pat<(v1i128 (bswap v1i128 :$A)),		def : Pat<(v1i128 (bswap v1i128 :$A)),
(v1i128 (COPY_TO_REGCLASS (XXBRQ (COPY_TO_REGCLASS $A, VSRC)), VRRC))>;		(v1i128 (COPY_TO_REGCLASS (XXBRQ (COPY_TO_REGCLASS $A, VSRC)), VRRC))>;

// D-Form Load/Store		// D-Form Load/Store
def : Pat<(v4i32 (quadwOffsetLoad iaddrX16:$src)), (LXV memrix16:$src)>;		def : Pat<(v4i32 (quadwOffsetLoad DQForm:$src)), (LXV memrix16:$src)>;
def : Pat<(v4f32 (quadwOffsetLoad iaddrX16:$src)), (LXV memrix16:$src)>;		def : Pat<(v4f32 (quadwOffsetLoad DQForm:$src)), (LXV memrix16:$src)>;
def : Pat<(v2i64 (quadwOffsetLoad iaddrX16:$src)), (LXV memrix16:$src)>;		def : Pat<(v2i64 (quadwOffsetLoad DQForm:$src)), (LXV memrix16:$src)>;
def : Pat<(v2f64 (quadwOffsetLoad iaddrX16:$src)), (LXV memrix16:$src)>;		def : Pat<(v2f64 (quadwOffsetLoad DQForm:$src)), (LXV memrix16:$src)>;
def : Pat<(f128 (quadwOffsetLoad iaddrX16:$src)),		def : Pat<(f128 (quadwOffsetLoad DQForm:$src)),
(COPY_TO_REGCLASS (LXV memrix16:$src), VRRC)>;		(COPY_TO_REGCLASS (LXV memrix16:$src), VRRC)>;
def : Pat<(v4i32 (int_ppc_vsx_lxvw4x iaddrX16:$src)), (LXV memrix16:$src)>;		def : Pat<(v4i32 (int_ppc_vsx_lxvw4x DQForm:$src)), (LXV memrix16:$src)>;
def : Pat<(v2f64 (int_ppc_vsx_lxvd2x iaddrX16:$src)), (LXV memrix16:$src)>;		def : Pat<(v2f64 (int_ppc_vsx_lxvd2x DQForm:$src)), (LXV memrix16:$src)>;

def : Pat<(quadwOffsetStore v4f32:$rS, iaddrX16:$dst), (STXV $rS, memrix16:$dst)>;		def : Pat<(quadwOffsetStore v4f32:$rS, DQForm:$dst), (STXV $rS, memrix16:$dst)>;
def : Pat<(quadwOffsetStore v4i32:$rS, iaddrX16:$dst), (STXV $rS, memrix16:$dst)>;		def : Pat<(quadwOffsetStore v4i32:$rS, DQForm:$dst), (STXV $rS, memrix16:$dst)>;
def : Pat<(quadwOffsetStore v2f64:$rS, iaddrX16:$dst), (STXV $rS, memrix16:$dst)>;		def : Pat<(quadwOffsetStore v2f64:$rS, DQForm:$dst), (STXV $rS, memrix16:$dst)>;
def : Pat<(quadwOffsetStore f128:$rS, iaddrX16:$dst),		def : Pat<(quadwOffsetStore f128:$rS, DQForm:$dst),
(STXV (COPY_TO_REGCLASS $rS, VSRC), memrix16:$dst)>;		(STXV (COPY_TO_REGCLASS $rS, VSRC), memrix16:$dst)>;
def : Pat<(quadwOffsetStore v2i64:$rS, iaddrX16:$dst), (STXV $rS, memrix16:$dst)>;		def : Pat<(quadwOffsetStore v2i64:$rS, DQForm:$dst), (STXV $rS, memrix16:$dst)>;
def : Pat<(int_ppc_vsx_stxvw4x v4i32:$rS, iaddrX16:$dst),		def : Pat<(int_ppc_vsx_stxvw4x v4i32:$rS, DQForm:$dst),
(STXV $rS, memrix16:$dst)>;		(STXV $rS, memrix16:$dst)>;
def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, iaddrX16:$dst),		def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, DQForm:$dst),
(STXV $rS, memrix16:$dst)>;		(STXV $rS, memrix16:$dst)>;

def : Pat<(v2f64 (nonQuadwOffsetLoad xoaddr:$src)), (LXVX xoaddr:$src)>;		def : Pat<(v2f64 (nonQuadwOffsetLoad ForceXForm:$src)), (LXVX ForceXForm:$src)>;
def : Pat<(v2i64 (nonQuadwOffsetLoad xoaddr:$src)), (LXVX xoaddr:$src)>;		def : Pat<(v2i64 (nonQuadwOffsetLoad ForceXForm:$src)), (LXVX ForceXForm:$src)>;
def : Pat<(v4f32 (nonQuadwOffsetLoad xoaddr:$src)), (LXVX xoaddr:$src)>;		def : Pat<(v4f32 (nonQuadwOffsetLoad ForceXForm:$src)), (LXVX ForceXForm:$src)>;
def : Pat<(v4i32 (nonQuadwOffsetLoad xoaddr:$src)), (LXVX xoaddr:$src)>;		def : Pat<(v4i32 (nonQuadwOffsetLoad ForceXForm:$src)), (LXVX ForceXForm:$src)>;
def : Pat<(v4i32 (int_ppc_vsx_lxvw4x xoaddr:$src)), (LXVX xoaddr:$src)>;		def : Pat<(v4i32 (int_ppc_vsx_lxvw4x ForceXForm:$src)), (LXVX ForceXForm:$src)>;
def : Pat<(v2f64 (int_ppc_vsx_lxvd2x xoaddr:$src)), (LXVX xoaddr:$src)>;		def : Pat<(v2f64 (int_ppc_vsx_lxvd2x ForceXForm:$src)), (LXVX ForceXForm:$src)>;
def : Pat<(f128 (nonQuadwOffsetLoad xoaddr:$src)),		def : Pat<(f128 (nonQuadwOffsetLoad ForceXForm:$src)),
(COPY_TO_REGCLASS (LXVX xoaddr:$src), VRRC)>;		(COPY_TO_REGCLASS (LXVX ForceXForm:$src), VRRC)>;
def : Pat<(nonQuadwOffsetStore f128:$rS, xoaddr:$dst),		def : Pat<(nonQuadwOffsetStore f128:$rS, ForceXForm:$dst),
(STXVX (COPY_TO_REGCLASS $rS, VSRC), xoaddr:$dst)>;		(STXVX (COPY_TO_REGCLASS $rS, VSRC), ForceXForm:$dst)>;
def : Pat<(nonQuadwOffsetStore v2f64:$rS, xoaddr:$dst),		def : Pat<(nonQuadwOffsetStore v2f64:$rS, ForceXForm:$dst),
(STXVX $rS, xoaddr:$dst)>;		(STXVX $rS, ForceXForm:$dst)>;
def : Pat<(nonQuadwOffsetStore v2i64:$rS, xoaddr:$dst),		def : Pat<(nonQuadwOffsetStore v2i64:$rS, ForceXForm:$dst),
(STXVX $rS, xoaddr:$dst)>;		(STXVX $rS, ForceXForm:$dst)>;
def : Pat<(nonQuadwOffsetStore v4f32:$rS, xoaddr:$dst),		def : Pat<(nonQuadwOffsetStore v4f32:$rS, ForceXForm:$dst),
(STXVX $rS, xoaddr:$dst)>;		(STXVX $rS, ForceXForm:$dst)>;
def : Pat<(nonQuadwOffsetStore v4i32:$rS, xoaddr:$dst),		def : Pat<(nonQuadwOffsetStore v4i32:$rS, ForceXForm:$dst),
(STXVX $rS, xoaddr:$dst)>;		(STXVX $rS, ForceXForm:$dst)>;
def : Pat<(int_ppc_vsx_stxvw4x v4i32:$rS, xoaddr:$dst),		def : Pat<(int_ppc_vsx_stxvw4x v4i32:$rS, ForceXForm:$dst),
(STXVX $rS, xoaddr:$dst)>;		(STXVX $rS, ForceXForm:$dst)>;
def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, xoaddr:$dst),		def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, ForceXForm:$dst),
(STXVX $rS, xoaddr:$dst)>;		(STXVX $rS, ForceXForm:$dst)>;
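
The split between the `DQForm` patterns (`LXV`/`STXV`) and the `ForceXForm` patterns (`LXVX`/`STXVX`) follows from what each encoding can express. The Python sketch below models only the displacement constraint: a DQ-form displacement is a signed 16-bit value that is a multiple of 16, and a DS-form displacement is a signed 16-bit value that is a multiple of 4. It is a deliberate simplification. According to the patch summary, the actual selection goes through `computeMOFlags()` and `getAddrModeForFlags()` and takes more than the displacement into account. All function names here are hypothetical.

```python
def fits_signed_16(offset):
    """True if the byte offset fits in a signed 16-bit displacement."""
    return -(1 << 15) <= offset < (1 << 15)


def fits_dq_form(offset):
    """DQ-form (e.g. LXV/STXV): signed 16-bit displacement, multiple of 16."""
    return fits_signed_16(offset) and offset % 16 == 0


def fits_ds_form(offset):
    """DS-form (e.g. LXSD/STXSD): signed 16-bit displacement, multiple of 4."""
    return fits_signed_16(offset) and offset % 4 == 0


def pick_vector_load(offset):
    """Hypothetical helper: LXV if the displacement is encodable, else LXVX."""
    return "LXV" if fits_dq_form(offset) else "LXVX"
```

For example, an offset of 32 is encodable in DQ-form under this model, while an offset of 8 is not and falls back to the indexed form.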

// Build vectors from i8 loads		// Build vectors from i8 loads
defm : ScalToVecWPermute<v8i16, ScalarLoads.ZELi8,		defm : ScalToVecWPermute<v8i16, ScalarLoads.ZELi8,
(VSPLTHs 3, (LXSIBZX xoaddr:$src)),		(VSPLTHs 3, (LXSIBZX ForceXForm:$src)),
(SUBREG_TO_REG (i64 1), (LXSIBZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIBZX ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<v4i32, ScalarLoads.ZELi8,		defm : ScalToVecWPermute<v4i32, ScalarLoads.ZELi8,
(XXSPLTWs (LXSIBZX xoaddr:$src), 1),		(XXSPLTWs (LXSIBZX ForceXForm:$src), 1),
(SUBREG_TO_REG (i64 1), (LXSIBZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIBZX ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<v2i64, ScalarLoads.ZELi8i64,		defm : ScalToVecWPermute<v2i64, ScalarLoads.ZELi8i64,
(XXPERMDIs (LXSIBZX xoaddr:$src), 0),		(XXPERMDIs (LXSIBZX ForceXForm:$src), 0),
(SUBREG_TO_REG (i64 1), (LXSIBZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIBZX ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, ScalarLoads.SELi8,		v4i32, ScalarLoads.SELi8,
(XXSPLTWs (VEXTSB2Ws (LXSIBZX xoaddr:$src)), 1),		(XXSPLTWs (VEXTSB2Ws (LXSIBZX ForceXForm:$src)), 1),
(SUBREG_TO_REG (i64 1), (VEXTSB2Ws (LXSIBZX xoaddr:$src)), sub_64)>;		(SUBREG_TO_REG (i64 1), (VEXTSB2Ws (LXSIBZX ForceXForm:$src)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, ScalarLoads.SELi8i64,		v2i64, ScalarLoads.SELi8i64,
(XXPERMDIs (VEXTSB2Ds (LXSIBZX xoaddr:$src)), 0),		(XXPERMDIs (VEXTSB2Ds (LXSIBZX ForceXForm:$src)), 0),
(SUBREG_TO_REG (i64 1), (VEXTSB2Ds (LXSIBZX xoaddr:$src)), sub_64)>;		(SUBREG_TO_REG (i64 1), (VEXTSB2Ds (LXSIBZX ForceXForm:$src)), sub_64)>;

// Build vectors from i16 loads		// Build vectors from i16 loads
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, ScalarLoads.ZELi16,		v4i32, ScalarLoads.ZELi16,
(XXSPLTWs (LXSIHZX xoaddr:$src), 1),		(XXSPLTWs (LXSIHZX ForceXForm:$src), 1),
(SUBREG_TO_REG (i64 1), (LXSIHZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIHZX ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, ScalarLoads.ZELi16i64,		v2i64, ScalarLoads.ZELi16i64,
(XXPERMDIs (LXSIHZX xoaddr:$src), 0),		(XXPERMDIs (LXSIHZX ForceXForm:$src), 0),
(SUBREG_TO_REG (i64 1), (LXSIHZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIHZX ForceXForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, ScalarLoads.SELi16,		v4i32, ScalarLoads.SELi16,
(XXSPLTWs (VEXTSH2Ws (LXSIHZX xoaddr:$src)), 1),		(XXSPLTWs (VEXTSH2Ws (LXSIHZX ForceXForm:$src)), 1),
(SUBREG_TO_REG (i64 1), (VEXTSH2Ws (LXSIHZX xoaddr:$src)), sub_64)>;		(SUBREG_TO_REG (i64 1), (VEXTSH2Ws (LXSIHZX ForceXForm:$src)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, ScalarLoads.SELi16i64,		v2i64, ScalarLoads.SELi16i64,
(XXPERMDIs (VEXTSH2Ds (LXSIHZX xoaddr:$src)), 0),		(XXPERMDIs (VEXTSH2Ds (LXSIHZX ForceXForm:$src)), 0),
(SUBREG_TO_REG (i64 1), (VEXTSH2Ds (LXSIHZX xoaddr:$src)), sub_64)>;		(SUBREG_TO_REG (i64 1), (VEXTSH2Ds (LXSIHZX ForceXForm:$src)), sub_64)>;

// Load/convert and convert/store patterns for f16.		// Load/convert and convert/store patterns for f16.
def : Pat<(f64 (extloadf16 xoaddr:$src)),		def : Pat<(f64 (extloadf16 ForceXForm:$src)),
(f64 (XSCVHPDP (LXSIHZX xoaddr:$src)))>;		(f64 (XSCVHPDP (LXSIHZX ForceXForm:$src)))>;
def : Pat<(truncstoref16 f64:$src, xoaddr:$dst),		def : Pat<(truncstoref16 f64:$src, ForceXForm:$dst),
(STXSIHX (XSCVDPHP $src), xoaddr:$dst)>;		(STXSIHX (XSCVDPHP $src), ForceXForm:$dst)>;
def : Pat<(f32 (extloadf16 xoaddr:$src)),		def : Pat<(f32 (extloadf16 ForceXForm:$src)),
(f32 (COPY_TO_REGCLASS (XSCVHPDP (LXSIHZX xoaddr:$src)), VSSRC))>;		(f32 (COPY_TO_REGCLASS (XSCVHPDP (LXSIHZX ForceXForm:$src)), VSSRC))>;
def : Pat<(truncstoref16 f32:$src, xoaddr:$dst),		def : Pat<(truncstoref16 f32:$src, ForceXForm:$dst),
(STXSIHX (XSCVDPHP (COPY_TO_REGCLASS $src, VSFRC)), xoaddr:$dst)>;		(STXSIHX (XSCVDPHP (COPY_TO_REGCLASS $src, VSFRC)), ForceXForm:$dst)>;
def : Pat<(f64 (f16_to_fp i32:$A)),		def : Pat<(f64 (f16_to_fp i32:$A)),
(f64 (XSCVHPDP (MTVSRWZ $A)))>;		(f64 (XSCVHPDP (MTVSRWZ $A)))>;
def : Pat<(f32 (f16_to_fp i32:$A)),		def : Pat<(f32 (f16_to_fp i32:$A)),
(f32 (COPY_TO_REGCLASS (XSCVHPDP (MTVSRWZ $A)), VSSRC))>;		(f32 (COPY_TO_REGCLASS (XSCVHPDP (MTVSRWZ $A)), VSSRC))>;
def : Pat<(i32 (fp_to_f16 f32:$A)),		def : Pat<(i32 (fp_to_f16 f32:$A)),
(i32 (MFVSRWZ (XSCVDPHP (COPY_TO_REGCLASS $A, VSFRC))))>;		(i32 (MFVSRWZ (XSCVDPHP (COPY_TO_REGCLASS $A, VSFRC))))>;
def : Pat<(i32 (fp_to_f16 f64:$A)), (i32 (MFVSRWZ (XSCVDPHP $A)))>;		def : Pat<(i32 (fp_to_f16 f64:$A)), (i32 (MFVSRWZ (XSCVDPHP $A)))>;

// Vector sign extensions		// Vector sign extensions
def : Pat<(f64 (PPCVexts f64:$A, 1)),		def : Pat<(f64 (PPCVexts f64:$A, 1)),
(f64 (COPY_TO_REGCLASS (VEXTSB2Ds $A), VSFRC))>;		(f64 (COPY_TO_REGCLASS (VEXTSB2Ds $A), VSFRC))>;
def : Pat<(f64 (PPCVexts f64:$A, 2)),		def : Pat<(f64 (PPCVexts f64:$A, 2)),
(f64 (COPY_TO_REGCLASS (VEXTSH2Ds $A), VSFRC))>;		(f64 (COPY_TO_REGCLASS (VEXTSH2Ds $A), VSFRC))>;

def : Pat<(f64 (extloadf32 iaddrX4:$src)),		def : Pat<(f64 (extloadf32 DSForm:$src)),
(COPY_TO_REGCLASS (DFLOADf32 iaddrX4:$src), VSFRC)>;		(COPY_TO_REGCLASS (DFLOADf32 DSForm:$src), VSFRC)>;
def : Pat<(f32 (fpround (f64 (extloadf32 iaddrX4:$src)))),		def : Pat<(f32 (fpround (f64 (extloadf32 DSForm:$src)))),
(f32 (DFLOADf32 iaddrX4:$src))>;		(f32 (DFLOADf32 DSForm:$src))>;

def : Pat<(v4f32 (PPCldvsxlh xaddr:$src)),		def : Pat<(v4f32 (PPCldvsxlh XForm:$src)),
(SUBREG_TO_REG (i64 1), (XFLOADf64 xaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (XFLOADf64 XForm:$src), sub_64)>;
def : Pat<(v4f32 (PPCldvsxlh iaddrX4:$src)),		def : Pat<(v4f32 (PPCldvsxlh DSForm:$src)),
(SUBREG_TO_REG (i64 1), (DFLOADf64 iaddrX4:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (DFLOADf64 DSForm:$src), sub_64)>;

// Convert (Un)Signed DWord in memory -> QP		// Convert (Un)Signed DWord in memory -> QP
def : Pat<(f128 (sint_to_fp (i64 (load xaddrX4:$src)))),		def : Pat<(f128 (sint_to_fp (i64 (load XForm:$src)))),
(f128 (XSCVSDQP (LXSDX xaddrX4:$src)))>;		(f128 (XSCVSDQP (LXSDX XForm:$src)))>;
def : Pat<(f128 (sint_to_fp (i64 (load iaddrX4:$src)))),		def : Pat<(f128 (sint_to_fp (i64 (load DSForm:$src)))),
(f128 (XSCVSDQP (LXSD iaddrX4:$src)))>;		(f128 (XSCVSDQP (LXSD DSForm:$src)))>;
def : Pat<(f128 (uint_to_fp (i64 (load xaddrX4:$src)))),		def : Pat<(f128 (uint_to_fp (i64 (load XForm:$src)))),
(f128 (XSCVUDQP (LXSDX xaddrX4:$src)))>;		(f128 (XSCVUDQP (LXSDX XForm:$src)))>;
def : Pat<(f128 (uint_to_fp (i64 (load iaddrX4:$src)))),		def : Pat<(f128 (uint_to_fp (i64 (load DSForm:$src)))),
(f128 (XSCVUDQP (LXSD iaddrX4:$src)))>;		(f128 (XSCVUDQP (LXSD DSForm:$src)))>;

// Convert Unsigned HWord in memory -> QP		// Convert Unsigned HWord in memory -> QP
def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi16)),		def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi16)),
(f128 (XSCVUDQP (LXSIHZX xaddr:$src)))>;		(f128 (XSCVUDQP (LXSIHZX XForm:$src)))>;

// Convert Unsigned Byte in memory -> QP		// Convert Unsigned Byte in memory -> QP
def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi8)),		def : Pat<(f128 (uint_to_fp ScalarLoads.ZELi8)),
(f128 (XSCVUDQP (LXSIBZX xoaddr:$src)))>;		(f128 (XSCVUDQP (LXSIBZX ForceXForm:$src)))>;

// Truncate & Convert QP -> (Un)Signed (D)Word.		// Truncate & Convert QP -> (Un)Signed (D)Word.
def : Pat<(i64 (any_fp_to_sint f128:$src)), (i64 (MFVRD (XSCVQPSDZ $src)))>;		def : Pat<(i64 (any_fp_to_sint f128:$src)), (i64 (MFVRD (XSCVQPSDZ $src)))>;
def : Pat<(i64 (any_fp_to_uint f128:$src)), (i64 (MFVRD (XSCVQPUDZ $src)))>;		def : Pat<(i64 (any_fp_to_uint f128:$src)), (i64 (MFVRD (XSCVQPUDZ $src)))>;
def : Pat<(i32 (any_fp_to_sint f128:$src)),		def : Pat<(i32 (any_fp_to_sint f128:$src)),
(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC)))>;		(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC)))>;
def : Pat<(i32 (any_fp_to_uint f128:$src)),		def : Pat<(i32 (any_fp_to_uint f128:$src)),
(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC)))>;		(i32 (MFVSRWZ (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC)))>;

// Instructions for store(fptosi).		// Instructions for store(fptosi).
// The 8-byte version is repeated here due to availability of D-Form STXSD.		// The 8-byte version is repeated here due to availability of D-Form STXSD.
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), xaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), XForm:$dst, 8),
(STXSDX (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),		(STXSDX (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),
xaddrX4:$dst)>;		XForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), iaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), DSForm:$dst, 8),
(STXSD (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),		(STXSD (COPY_TO_REGCLASS (XSCVQPSDZ f128:$src), VFRC),
iaddrX4:$dst)>;		DSForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), xoaddr:$dst, 4),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), ForceXForm:$dst, 4),
(STXSIWX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), xoaddr:$dst)>;		(STXSIWX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), xoaddr:$dst, 2),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), ForceXForm:$dst, 2),
(STXSIHX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), xoaddr:$dst)>;		(STXSIHX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), xoaddr:$dst, 1),		(f64 (PPCcv_fp_to_sint_in_vsr f128:$src)), ForceXForm:$dst, 1),
(STXSIBX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), xoaddr:$dst)>;		(STXSIBX (COPY_TO_REGCLASS (XSCVQPSWZ $src), VFRC), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), xaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), XForm:$dst, 8),
(STXSDX (XSCVDPSXDS f64:$src), xaddrX4:$dst)>;		(STXSDX (XSCVDPSXDS f64:$src), XForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), iaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), DSForm:$dst, 8),
(STXSD (XSCVDPSXDS f64:$src), iaddrX4:$dst)>;		(STXSD (XSCVDPSXDS f64:$src), DSForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), xoaddr:$dst, 2),		(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), ForceXForm:$dst, 2),
(STXSIHX (XSCVDPSXWS f64:$src), xoaddr:$dst)>;		(STXSIHX (XSCVDPSXWS f64:$src), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), xoaddr:$dst, 1),		(f64 (PPCcv_fp_to_sint_in_vsr f64:$src)), ForceXForm:$dst, 1),
(STXSIBX (XSCVDPSXWS f64:$src), xoaddr:$dst)>;		(STXSIBX (XSCVDPSXWS f64:$src), ForceXForm:$dst)>;

// Instructions for store(fptoui).		// Instructions for store(fptoui).
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), XForm:$dst, 8),
(STXSDX (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),		(STXSDX (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),
xaddrX4:$dst)>;		XForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), iaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), DSForm:$dst, 8),
(STXSD (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),		(STXSD (COPY_TO_REGCLASS (XSCVQPUDZ f128:$src), VFRC),
iaddrX4:$dst)>;		DSForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 4),		(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), ForceXForm:$dst, 4),
(STXSIWX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;		(STXSIWX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 2),		(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), ForceXForm:$dst, 2),
(STXSIHX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;		(STXSIHX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), xoaddr:$dst, 1),		(f64 (PPCcv_fp_to_uint_in_vsr f128:$src)), ForceXForm:$dst, 1),
(STXSIBX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), xoaddr:$dst)>;		(STXSIBX (COPY_TO_REGCLASS (XSCVQPUWZ $src), VFRC), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), xaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), XForm:$dst, 8),
(STXSDX (XSCVDPUXDS f64:$src), xaddrX4:$dst)>;		(STXSDX (XSCVDPUXDS f64:$src), XForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), iaddrX4:$dst, 8),		(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), DSForm:$dst, 8),
(STXSD (XSCVDPUXDS f64:$src), iaddrX4:$dst)>;		(STXSD (XSCVDPUXDS f64:$src), DSForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), xoaddr:$dst, 2),		(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), ForceXForm:$dst, 2),
(STXSIHX (XSCVDPUXWS f64:$src), xoaddr:$dst)>;		(STXSIHX (XSCVDPUXWS f64:$src), ForceXForm:$dst)>;
def : Pat<(PPCstore_scal_int_from_vsr		def : Pat<(PPCstore_scal_int_from_vsr
(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), xoaddr:$dst, 1),		(f64 (PPCcv_fp_to_uint_in_vsr f64:$src)), ForceXForm:$dst, 1),
(STXSIBX (XSCVDPUXWS f64:$src), xoaddr:$dst)>;		(STXSIBX (XSCVDPUXWS f64:$src), ForceXForm:$dst)>;
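Note: the right-hand column of this hunk differs from the left only in the address operand names. The sketch below reproduces that textual rename for the four operand names visible in this excerpt. The mapping is read off these lines only and is not claimed to cover the whole patch, and `rename_operands` is a hypothetical helper, not part of LLVM.

```python
import re

# Operand renames observed in this hunk only (old name -> new name).
RENAMES = {
    "xaddrX4": "XForm",
    "xaddr": "XForm",
    "iaddrX4": "DSForm",
    "xoaddr": "ForceXForm",
}

# Longest names first so "xaddrX4" is tried before "xaddr".
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(RENAMES, key=len, reverse=True)) + r")\b"
)


def rename_operands(line: str) -> str:
    """Replace each old address operand name with its new name."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], line)


if __name__ == "__main__":
    print(rename_operands("(STXSD (XSCVDPUXDS f64:$src), iaddrX4:$dst)>;"))
```

Running the helper over a left-column line should yield the corresponding right-column line, which is a quick way to confirm that no pattern in this region changed beyond the operand class.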

// Round & Convert QP -> DP/SP		// Round & Convert QP -> DP/SP
def : Pat<(f64 (any_fpround f128:$src)), (f64 (XSCVQPDP $src))>;		def : Pat<(f64 (any_fpround f128:$src)), (f64 (XSCVQPDP $src))>;
def : Pat<(f32 (any_fpround f128:$src)), (f32 (XSRSP (XSCVQPDPO $src)))>;		def : Pat<(f32 (any_fpround f128:$src)), (f32 (XSRSP (XSCVQPDPO $src)))>;

// Convert SP -> QP		// Convert SP -> QP
def : Pat<(f128 (any_fpextend f32:$src)),		def : Pat<(f128 (any_fpextend f32:$src)),
(f128 (XSCVDPQP (COPY_TO_REGCLASS $src, VFRC)))>;		(f128 (XSCVDPQP (COPY_TO_REGCLASS $src, VFRC)))>;
[... 18 lines not shown ...]
def : Pat<(v16i8 (build_vector immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,
immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,		immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,
immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,		immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,
immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,		immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,
immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,		immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A,
immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A)),		immNonAllOneAnyExt8:$A, immNonAllOneAnyExt8:$A)),
(v16i8 (COPY_TO_REGCLASS (XXSPLTIB imm:$A), VSRC))>;		(v16i8 (COPY_TO_REGCLASS (XXSPLTIB imm:$A), VSRC))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, FltToIntLoad.A,		v4i32, FltToIntLoad.A,
(XVCVSPSXWS (LXVWSX xoaddr:$A)),		(XVCVSPSXWS (LXVWSX ForceXForm:$A)),
(XVCVSPSXWS (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$A), sub_64))>;		(XVCVSPSXWS (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$A), sub_64))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, FltToUIntLoad.A,		v4i32, FltToUIntLoad.A,
(XVCVSPUXWS (LXVWSX xoaddr:$A)),		(XVCVSPUXWS (LXVWSX ForceXForm:$A)),
(XVCVSPUXWS (SUBREG_TO_REG (i64 1), (LIWZX xoaddr:$A), sub_64))>;		(XVCVSPUXWS (SUBREG_TO_REG (i64 1), (LIWZX ForceXForm:$A), sub_64))>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, DblToIntLoadP9.A,		v4i32, DblToIntLoadP9.A,
(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPSXWS (DFLOADf64 iaddrX4:$A)), sub_64), 1),		(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPSXWS (DFLOADf64 DSForm:$A)), sub_64), 1),
(SUBREG_TO_REG (i64 1), (XSCVDPSXWS (DFLOADf64 iaddrX4:$A)), sub_64)>;		(SUBREG_TO_REG (i64 1), (XSCVDPSXWS (DFLOADf64 DSForm:$A)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v4i32, DblToUIntLoadP9.A,		v4i32, DblToUIntLoadP9.A,
(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPUXWS (DFLOADf64 iaddrX4:$A)), sub_64), 1),		(XXSPLTW (SUBREG_TO_REG (i64 1), (XSCVDPUXWS (DFLOADf64 DSForm:$A)), sub_64), 1),
(SUBREG_TO_REG (i64 1), (XSCVDPUXWS (DFLOADf64 iaddrX4:$A)), sub_64)>;		(SUBREG_TO_REG (i64 1), (XSCVDPUXWS (DFLOADf64 DSForm:$A)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, FltToLongLoadP9.A,		v2i64, FltToLongLoadP9.A,
(XXPERMDIs (XSCVDPSXDS (COPY_TO_REGCLASS (DFLOADf32 iaddrX4:$A), VSFRC)), 0),		(XXPERMDIs (XSCVDPSXDS (COPY_TO_REGCLASS (DFLOADf32 DSForm:$A), VSFRC)), 0),
(SUBREG_TO_REG		(SUBREG_TO_REG
(i64 1),		(i64 1),
(XSCVDPSXDS (COPY_TO_REGCLASS (DFLOADf32 iaddrX4:$A), VSFRC)), sub_64)>;		(XSCVDPSXDS (COPY_TO_REGCLASS (DFLOADf32 DSForm:$A), VSFRC)), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, FltToULongLoadP9.A,		v2i64, FltToULongLoadP9.A,
(XXPERMDIs (XSCVDPUXDS (COPY_TO_REGCLASS (DFLOADf32 iaddrX4:$A), VSFRC)), 0),		(XXPERMDIs (XSCVDPUXDS (COPY_TO_REGCLASS (DFLOADf32 DSForm:$A), VSFRC)), 0),
(SUBREG_TO_REG		(SUBREG_TO_REG
(i64 1),		(i64 1),
(XSCVDPUXDS (COPY_TO_REGCLASS (DFLOADf32 iaddrX4:$A), VSFRC)), sub_64)>;		(XSCVDPUXDS (COPY_TO_REGCLASS (DFLOADf32 DSForm:$A), VSFRC)), sub_64)>;
def : Pat<(v4f32 (PPCldsplat xoaddr:$A)),		def : Pat<(v4f32 (PPCldsplat ForceXForm:$A)),
(v4f32 (LXVWSX xoaddr:$A))>;		(v4f32 (LXVWSX ForceXForm:$A))>;
def : Pat<(v4i32 (PPCldsplat xoaddr:$A)),		def : Pat<(v4i32 (PPCldsplat ForceXForm:$A)),
(v4i32 (LXVWSX xoaddr:$A))>;		(v4i32 (LXVWSX ForceXForm:$A))>;
} // HasVSX, HasP9Vector		} // HasVSX, HasP9Vector

// Any Power9 VSX subtarget with equivalent length but better Power10 VSX		// Any Power9 VSX subtarget with equivalent length but better Power10 VSX
// patterns.		// patterns.
// Two identical blocks are required due to the slightly different predicates:		// Two identical blocks are required due to the slightly different predicates:
// One without P10 instructions, the other is BigEndian only with P10 instructions.		// One without P10 instructions, the other is BigEndian only with P10 instructions.
let Predicates = [HasVSX, HasP9Vector, NoP10Vector] in {		let Predicates = [HasVSX, HasP9Vector, NoP10Vector] in {
// Little endian Power10 subtargets produce a shorter pattern but require a		// Little endian Power10 subtargets produce a shorter pattern but require a
// COPY_TO_REGCLASS. The COPY_TO_REGCLASS makes it appear to need two instructions		// COPY_TO_REGCLASS. The COPY_TO_REGCLASS makes it appear to need two instructions
// to perform the operation, when only one instruction is produced in practice.		// to perform the operation, when only one instruction is produced in practice.
// The NoP10Vector predicate excludes these patterns from Power10 VSX subtargets.		// The NoP10Vector predicate excludes these patterns from Power10 VSX subtargets.
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v16i8, ScalarLoads.Li8,		v16i8, ScalarLoads.Li8,
(VSPLTBs 7, (LXSIBZX xoaddr:$src)),		(VSPLTBs 7, (LXSIBZX ForceXForm:$src)),
(SUBREG_TO_REG (i64 1), (LXSIBZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIBZX ForceXForm:$src), sub_64)>;
// Build vectors from i16 loads		// Build vectors from i16 loads
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v8i16, ScalarLoads.Li16,		v8i16, ScalarLoads.Li16,
(VSPLTHs 3, (LXSIHZX xoaddr:$src)),		(VSPLTHs 3, (LXSIHZX ForceXForm:$src)),
(SUBREG_TO_REG (i64 1), (LXSIHZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIHZX ForceXForm:$src), sub_64)>;
} // HasVSX, HasP9Vector, NoP10Vector		} // HasVSX, HasP9Vector, NoP10Vector

// Any big endian Power9 VSX subtarget		// Any big endian Power9 VSX subtarget
let Predicates = [HasVSX, HasP9Vector, IsBigEndian] in {		let Predicates = [HasVSX, HasP9Vector, IsBigEndian] in {
// Power10 VSX subtargets produce a shorter pattern for little endian targets		// Power10 VSX subtargets produce a shorter pattern for little endian targets
// but this is still the best pattern for Power9 and Power10 VSX big endian		// but this is still the best pattern for Power9 and Power10 VSX big endian
// Build vectors from i8 loads		// Build vectors from i8 loads
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v16i8, ScalarLoads.Li8,		v16i8, ScalarLoads.Li8,
(VSPLTBs 7, (LXSIBZX xoaddr:$src)),		(VSPLTBs 7, (LXSIBZX ForceXForm:$src)),
(SUBREG_TO_REG (i64 1), (LXSIBZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIBZX ForceXForm:$src), sub_64)>;
// Build vectors from i16 loads		// Build vectors from i16 loads
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v8i16, ScalarLoads.Li16,		v8i16, ScalarLoads.Li16,
(VSPLTHs 3, (LXSIHZX xoaddr:$src)),		(VSPLTHs 3, (LXSIHZX ForceXForm:$src)),
(SUBREG_TO_REG (i64 1), (LXSIHZX xoaddr:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (LXSIHZX ForceXForm:$src), sub_64)>;

def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 0)))))),		def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 0)))))),
(f32 (XSCVUXDSP (XXEXTRACTUW $A, 0)))>;		(f32 (XSCVUXDSP (XXEXTRACTUW $A, 0)))>;
def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 1)))))),		def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 1)))))),
(f32 (XSCVUXDSP (XXEXTRACTUW $A, 4)))>;		(f32 (XSCVUXDSP (XXEXTRACTUW $A, 4)))>;
def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 2)))))),		def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 2)))))),
(f32 (XSCVUXDSP (XXEXTRACTUW $A, 8)))>;		(f32 (XSCVUXDSP (XXEXTRACTUW $A, 8)))>;
def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 3)))))),		def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 3)))))),
[... 19 lines not shown ...]
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 1)),		def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 1)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 4))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 4))>;
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 2)),		def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 2)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 8))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 8))>;
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 3)),		def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 3)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 12))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 12))>;

// Scalar stores of i8		// Scalar stores of i8
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 0)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 0)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 9)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 9)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 1)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 1)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 2)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 2)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 11)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 11)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 3)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 3)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 4)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 4)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 13)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 13)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 5)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 5)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 6)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 6)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 15)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 15)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 7)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 7)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS $S, VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS $S, VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 8)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 8)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 1)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 1)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 9)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 9)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 10)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 10)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 3)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 3)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 11)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 11)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 12)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 12)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 5)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 5)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 13)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 13)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 14)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 14)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 7)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 7)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 15)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 15)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), ForceXForm:$dst)>;

// Scalar stores of i16		// Scalar stores of i16
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 0)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 0)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 1)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 1)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 2)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 2)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 3)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 3)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS $S, VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS $S, VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 4)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 4)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 5)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 5)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 6)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 6)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 7)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 7)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), ForceXForm:$dst)>;
} // HasVSX, HasP9Vector, IsBigEndian		} // HasVSX, HasP9Vector, IsBigEndian
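Note on the `VSLDOI` amounts in the big-endian scalar store patterns above: they follow a regular rule, which makes the tables easy to re-check. The sketch assumes that `VSLDOI $S, $S, n` rotates the vector left by `n` bytes and that the byte and halfword stores take their data from byte 7 and bytes 6-7 of the register. That reading is an interpretation of the patterns, not something stated in the diff, and the function names are hypothetical.

```python
def be_i8_shift(idx: int) -> int:
    """Rotate amount that moves byte element idx to byte 7."""
    return (idx + 9) % 16


def be_i16_shift(idx: int) -> int:
    """Rotate amount that moves halfword element idx to bytes 6-7."""
    return (2 * idx + 10) % 16


# Amounts copied from the patterns above; 0 stands for the patterns
# that use $S directly without a VSLDOI.
I8_FROM_DIFF = [9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7, 8]
I16_FROM_DIFF = [10, 12, 14, 0, 2, 4, 6, 8]

if __name__ == "__main__":
    print([be_i8_shift(i) for i in range(16)] == I8_FROM_DIFF)
    print([be_i16_shift(i) for i in range(8)] == I16_FROM_DIFF)
```

The rule matches every amount listed in the i8 and i16 groups, including element 7 (i8) and element 3 (i16), where the rotate amount is zero and the pattern omits the `VSLDOI`.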

// Big endian 64Bit Power9 subtarget.		// Big endian 64Bit Power9 subtarget.
let Predicates = [HasVSX, HasP9Vector, IsBigEndian, IsPPC64] in {		let Predicates = [HasVSX, HasP9Vector, IsBigEndian, IsPPC64] in {
def : Pat<(v2i64 (scalar_to_vector (i64 (load iaddrX4:$src)))),		def : Pat<(v2i64 (scalar_to_vector (i64 (load DSForm:$src)))),
(v2i64 (SUBREG_TO_REG (i64 1), (DFLOADf64 iaddrX4:$src), sub_64))>;		(v2i64 (SUBREG_TO_REG (i64 1), (DFLOADf64 DSForm:$src), sub_64))>;
def : Pat<(v2i64 (scalar_to_vector (i64 (load xaddrX4:$src)))),		def : Pat<(v2i64 (scalar_to_vector (i64 (load XForm:$src)))),
(v2i64 (SUBREG_TO_REG (i64 1), (XFLOADf64 xaddrX4:$src), sub_64))>;		(v2i64 (SUBREG_TO_REG (i64 1), (XFLOADf64 XForm:$src), sub_64))>;

def : Pat<(v2f64 (scalar_to_vector (f64 (load iaddrX4:$src)))),		def : Pat<(v2f64 (scalar_to_vector (f64 (load DSForm:$src)))),
(v2f64 (SUBREG_TO_REG (i64 1), (DFLOADf64 iaddrX4:$src), sub_64))>;		(v2f64 (SUBREG_TO_REG (i64 1), (DFLOADf64 DSForm:$src), sub_64))>;
def : Pat<(v2f64 (scalar_to_vector (f64 (load xaddrX4:$src)))),		def : Pat<(v2f64 (scalar_to_vector (f64 (load XForm:$src)))),
(v2f64 (SUBREG_TO_REG (i64 1), (XFLOADf64 xaddrX4:$src), sub_64))>;		(v2f64 (SUBREG_TO_REG (i64 1), (XFLOADf64 XForm:$src), sub_64))>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), xaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), xaddrX4:$src)>;		sub_64), XForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), xaddrX4:$src)>;		sub_64), XForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 0)), xaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xaddrX4:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), XForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), xaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xaddrX4:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), XForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), iaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), iaddrX4:$src)>;		sub_64), DSForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), iaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), iaddrX4:$src)>;		sub_64), DSForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 0)), iaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), iaddrX4:$src)>;		(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), DSForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), iaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), iaddrX4:$src)>;		(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), DSForm:$src)>;

// (Un)Signed DWord vector extract -> QP		// (Un)Signed DWord vector extract -> QP
def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 0)))),		def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 0)))),
(f128 (XSCVSDQP (COPY_TO_REGCLASS $src, VFRC)))>;		(f128 (XSCVSDQP (COPY_TO_REGCLASS $src, VFRC)))>;
def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 1)))),		def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 1)))),
(f128 (XSCVSDQP		(f128 (XSCVSDQP
(EXTRACT_SUBREG (XXPERMDI $src, $src, 3), sub_64)))>;		(EXTRACT_SUBREG (XXPERMDI $src, $src, 3), sub_64)))>;
def : Pat<(f128 (uint_to_fp (i64 (extractelt v2i64:$src, 0)))),		def : Pat<(f128 (uint_to_fp (i64 (extractelt v2i64:$src, 0)))),
[... 79 lines not shown ...]
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 0)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 12))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 12))>;
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 1)),		def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 1)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 8))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 8))>;
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 2)),		def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 2)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 4))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 4))>;
def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 3)),		def : Pat<(v4f32 (insertelt v4f32:$A, f32:$B, 3)),
(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 0))>;		(v4f32 (XXINSERTW v4f32:$A, AlignValues.F32_TO_BE_WORD1, 0))>;
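Note on the `XXINSERTW` byte offsets: the two `insertelt` groups in this excerpt use mirrored offsets. In the group under the `IsBigEndian` predicate the offset is `4 * idx`, and in this second group it is `12 - 4 * idx`. The predicate line of the second group is hidden in this excerpt, so treating it as the little-endian block is an assumption, and the function names are hypothetical.

```python
def insertw_offset_first_group(idx: int) -> int:
    """Byte offset used by the patterns in the IsBigEndian block."""
    return 4 * idx


def insertw_offset_second_group(idx: int) -> int:
    """Byte offset used by the second group (assumed little endian)."""
    return 12 - 4 * idx


# Offsets copied from the visible patterns, keyed by element index.
FIRST_GROUP_FROM_DIFF = {1: 4, 2: 8, 3: 12}
SECOND_GROUP_FROM_DIFF = {0: 12, 1: 8, 2: 4, 3: 0}

if __name__ == "__main__":
    for idx, off in SECOND_GROUP_FROM_DIFF.items():
        print(idx, off, insertw_offset_second_group(idx))
```

Both rules agree with every offset visible here. Element 0 of the first group falls in the hidden lines, so it is not checked.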

def : Pat<(v8i16 (PPCld_vec_be xoaddr:$src)),		def : Pat<(v8i16 (PPCld_vec_be ForceXForm:$src)),
(COPY_TO_REGCLASS (LXVH8X xoaddr:$src), VRRC)>;		(COPY_TO_REGCLASS (LXVH8X ForceXForm:$src), VRRC)>;
def : Pat<(PPCst_vec_be v8i16:$rS, xoaddr:$dst),		def : Pat<(PPCst_vec_be v8i16:$rS, ForceXForm:$dst),
(STXVH8X (COPY_TO_REGCLASS $rS, VSRC), xoaddr:$dst)>;		(STXVH8X (COPY_TO_REGCLASS $rS, VSRC), ForceXForm:$dst)>;

def : Pat<(v16i8 (PPCld_vec_be xoaddr:$src)),		def : Pat<(v16i8 (PPCld_vec_be ForceXForm:$src)),
(COPY_TO_REGCLASS (LXVB16X xoaddr:$src), VRRC)>;		(COPY_TO_REGCLASS (LXVB16X ForceXForm:$src), VRRC)>;
def : Pat<(PPCst_vec_be v16i8:$rS, xoaddr:$dst),		def : Pat<(PPCst_vec_be v16i8:$rS, ForceXForm:$dst),
(STXVB16X (COPY_TO_REGCLASS $rS, VSRC), xoaddr:$dst)>;		(STXVB16X (COPY_TO_REGCLASS $rS, VSRC), ForceXForm:$dst)>;

// Scalar stores of i8		// Scalar stores of i8
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 0)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 0)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 1)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 1)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 7)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 7)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 2)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 2)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 3)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 3)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 5)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 5)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 4)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 4)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 5)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 5)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 3)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 3)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 6)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 6)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 7)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 7)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 1)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 1)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 8)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 8)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS $S, VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS $S, VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 9)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 9)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 15)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 15)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 10)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 10)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 11)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 11)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 13)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 13)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 12)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 12)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 13)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 13)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 11)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 11)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 14)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 14)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 15)), xoaddr:$dst),		def : Pat<(truncstorei8 (i32 (vector_extract v16i8:$S, 15)), ForceXForm:$dst),
(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 9)), VSRC), xoaddr:$dst)>;		(STXSIBXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 9)), VSRC), ForceXForm:$dst)>;

// Scalar stores of i16		// Scalar stores of i16
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 0)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 0)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 8)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 1)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 1)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 6)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 2)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 2)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 4)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 3)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 3)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 2)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 4)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 4)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS $S, VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS $S, VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 5)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 5)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 14)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 6)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 6)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 12)), VSRC), ForceXForm:$dst)>;
def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 7)), xoaddr:$dst),		def : Pat<(truncstorei16 (i32 (vector_extract v8i16:$S, 7)), ForceXForm:$dst),
(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), xoaddr:$dst)>;		(STXSIHXv (COPY_TO_REGCLASS (v16i8 (VSLDOI $S, $S, 10)), VSRC), ForceXForm:$dst)>;
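
Reviewer note: the VSLDOI shift amounts in the i8 and i16 store patterns above follow a single rule, which is easier to check as arithmetic than by reading each pattern. The sketch below is a hypothetical helper (the name `vsldoiShiftForExtract` is not part of the patch) that reproduces the immediates visible in this hunk. It is derived only from those visible patterns and makes no claim about patterns outside it.

```cpp
// Hypothetical helper, NOT part of D93370: reproduces the VSLDOI byte-shift
// immediates seen in the truncstorei8 / truncstorei16 patterns of this hunk.
// ElemIdx   - index passed to vector_extract
// ElemBytes - element size in bytes (1 for v16i8, 2 for v8i16)
constexpr unsigned vsldoiShiftForExtract(unsigned ElemIdx, unsigned ElemBytes) {
  // Shift is (8 - ElemIdx * ElemBytes) mod 16; the +16 keeps the
  // unsigned subtraction from wrapping below zero.
  return (16 + 8 - (ElemIdx * ElemBytes) % 16) % 16;
}

// Spot checks against the patterns shown above.
static_assert(vsldoiShiftForExtract(6, 1) == 2, "v16i8 element 6");
static_assert(vsldoiShiftForExtract(9, 1) == 15, "v16i8 element 9");
static_assert(vsldoiShiftForExtract(0, 2) == 8, "v8i16 element 0");
static_assert(vsldoiShiftForExtract(5, 2) == 14, "v8i16 element 5");
```

A result of 0 corresponds to the patterns that pass $S straight to COPY_TO_REGCLASS without a VSLDOI (v16i8 element 8 and v8i16 element 4).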

defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, (i64 (load iaddrX4:$src)),		v2i64, (i64 (load DSForm:$src)),
(XXPERMDIs (DFLOADf64 iaddrX4:$src), 2),		(XXPERMDIs (DFLOADf64 DSForm:$src), 2),
(SUBREG_TO_REG (i64 1), (DFLOADf64 iaddrX4:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (DFLOADf64 DSForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2i64, (i64 (load xaddrX4:$src)),		v2i64, (i64 (load XForm:$src)),
(XXPERMDIs (XFLOADf64 xaddrX4:$src), 2),		(XXPERMDIs (XFLOADf64 XForm:$src), 2),
(SUBREG_TO_REG (i64 1), (XFLOADf64 xaddrX4:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (XFLOADf64 XForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2f64, (f64 (load iaddrX4:$src)),		v2f64, (f64 (load DSForm:$src)),
(XXPERMDIs (DFLOADf64 iaddrX4:$src), 2),		(XXPERMDIs (DFLOADf64 DSForm:$src), 2),
(SUBREG_TO_REG (i64 1), (DFLOADf64 iaddrX4:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (DFLOADf64 DSForm:$src), sub_64)>;
defm : ScalToVecWPermute<		defm : ScalToVecWPermute<
v2f64, (f64 (load xaddrX4:$src)),		v2f64, (f64 (load XForm:$src)),
(XXPERMDIs (XFLOADf64 xaddrX4:$src), 2),		(XXPERMDIs (XFLOADf64 XForm:$src), 2),
(SUBREG_TO_REG (i64 1), (XFLOADf64 xaddrX4:$src), sub_64)>;		(SUBREG_TO_REG (i64 1), (XFLOADf64 XForm:$src), sub_64)>;

def : Pat<(store (i64 (extractelt v2i64:$A, 0)), xaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), xaddrX4:$src)>;		sub_64), XForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), xaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(XFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), xaddrX4:$src)>;		sub_64), XForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), xaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xaddrX4:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), XForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), xaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), XForm:$src),
(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), xaddrX4:$src)>;		(XFSTOREf64 (EXTRACT_SUBREG $A, sub_64), XForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 0)), iaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 0)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),		(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2),
sub_64), iaddrX4:$src)>;		sub_64), DSForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 0)), iaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 0)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),		(DFSTOREf64 (EXTRACT_SUBREG (XXPERMDI $A, $A, 2), sub_64),
iaddrX4:$src)>;		DSForm:$src)>;
def : Pat<(store (i64 (extractelt v2i64:$A, 1)), iaddrX4:$src),		def : Pat<(store (i64 (extractelt v2i64:$A, 1)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), iaddrX4:$src)>;		(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), DSForm:$src)>;
def : Pat<(store (f64 (extractelt v2f64:$A, 1)), iaddrX4:$src),		def : Pat<(store (f64 (extractelt v2f64:$A, 1)), DSForm:$src),
(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), iaddrX4:$src)>;		(DFSTOREf64 (EXTRACT_SUBREG $A, sub_64), DSForm:$src)>;

// (Un)Signed DWord vector extract -> QP		// (Un)Signed DWord vector extract -> QP
def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 0)))),		def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 0)))),
(f128 (XSCVSDQP		(f128 (XSCVSDQP
(EXTRACT_SUBREG (XXPERMDI $src, $src, 3), sub_64)))>;		(EXTRACT_SUBREG (XXPERMDI $src, $src, 3), sub_64)))>;
def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 1)))),		def : Pat<(f128 (sint_to_fp (i64 (extractelt v2i64:$src, 1)))),
(f128 (XSCVSDQP (COPY_TO_REGCLASS $src, VFRC)))>;		(f128 (XSCVSDQP (COPY_TO_REGCLASS $src, VFRC)))>;
def : Pat<(f128 (uint_to_fp (i64 (extractelt v2i64:$src, 0)))),		def : Pat<(f128 (uint_to_fp (i64 (extractelt v2i64:$src, 0)))),

llvm/test/CodeGen/PowerPC/p9-dform-load-alignment.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-unknown \			; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-unknown \
	; RUN: -verify-machineinstrs -ppc-asm-full-reg-names \			; RUN: -verify-machineinstrs -ppc-asm-full-reg-names \
	; RUN: -ppc-vsr-nums-as-vr < %s | FileCheck %s			; RUN: -ppc-vsr-nums-as-vr < %s | FileCheck %s

	@best8x8mode = external dso_local local_unnamed_addr global [4 x i16], align 2			@best8x8mode = external dso_local local_unnamed_addr global [4 x i16], align 2
	define dso_local void @AlignDSForm() local_unnamed_addr {			define dso_local void @AlignDSForm() local_unnamed_addr {
	; CHECK-LABEL: AlignDSForm:			; CHECK-LABEL: AlignDSForm:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: addis r3, r2, best8x8mode@toc@ha			; CHECK-NEXT: addis r3, r2, best8x8mode@toc@ha
	; CHECK-NEXT: addi r3, r3, best8x8mode@toc@l			; CHECK-NEXT: addi r3, r3, best8x8mode@toc@l
	; CHECK-NEXT: ldx r3, 0, r3			; CHECK-NEXT: ld r3, 0(r3)
	; CHECK-NEXT: std r3, 0(r3)			; CHECK-NEXT: std r3, 0(r3)
	entry:			entry:
	%0 = load <4 x i16>, <4 x i16>* bitcast ([4 x i16]* @best8x8mode to <4 x i16>*), align 2			%0 = load <4 x i16>, <4 x i16>* bitcast ([4 x i16]* @best8x8mode to <4 x i16>*), align 2
	store <4 x i16> %0, <4 x i16>* undef, align 4			store <4 x i16> %0, <4 x i16>* undef, align 4
	unreachable			unreachable
	}			}
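
Reviewer note: the updated CHECK line expects the DS-form `ld r3, 0(r3)` where the previous expectation was the X-form `ldx r3, 0, r3`. DS-form instructions such as `ld` and `std` encode the displacement as a 14-bit field with two implied low zero bits, so the displacement must be a signed 16-bit value that is a multiple of 4. The constraint applies to the immediate, not to the `align` attribute on the IR load. A minimal sketch of that encoding check follows; `fitsDSFormDisplacement` is a hypothetical name used for illustration and is not a function from the patch.

```cpp
#include <cstdint>

// Hypothetical helper, NOT part of D93370: returns true when Disp can be
// encoded as the displacement of a DS-form instruction (e.g. ld, std).
constexpr bool fitsDSFormDisplacement(int64_t Disp) {
  // Must fit in a signed 16-bit immediate...
  const bool FitsSigned16 = Disp >= -32768 && Disp <= 32767;
  // ...and the two low bits must be zero, since they are not encoded.
  const bool MultipleOf4 = (Disp % 4) == 0;
  return FitsSigned16 && MultipleOf4;
}

// The displacement in the test above is 0, which is encodable.
static_assert(fitsDSFormDisplacement(0), "0 is a valid DS-form displacement");
static_assert(!fitsDSFormDisplacement(2), "2 is not a multiple of 4");
```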