This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Improve side steering of FPd unit and FXU registers.
Needs Review · Public

Authored by jonpa on Mar 5 2018, 7:46 AM.

Details

Reviewers
uweigand
Summary
  • Add side-steering of FXU registers.
  • Improve and implement a general side steering utility.

Add side-steering of FXU registers.

Candidate gets a new BypassCost member.

bypassCost() computes the BypassCost for Candidate. It tries to match the FXU register uses with the previous def of the same register. It also tries not to place a def in the last slot of a decoder group if it has a successor which is only waiting for that reg.
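As a rough, self-contained sketch of the idea (the function shape and the exact cost values are assumptions based on this description and the statistics below):

    // Sketch: a use that decodes on the same processor side as its def
    // can take the bypass (good, negative cost); a cross-side use cannot.
    // Slots 0-2 of a decoder cycle form the "left" side, 3-5 the "right".
    int bypassCost(unsigned UseCycleIdx, unsigned DefCycleIdx) {
      bool UseLeft = (UseCycleIdx % 6) < 3;
      bool DefLeft = (DefCycleIdx % 6) < 3;
      return (UseLeft == DefLeft) ? -2 : 2;
    }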

Improve the side steering and also get better results for the FPd unit.

On trunk, the only side-steering heuristic is to check for the exact distance of 3, since only then is it certain that two FPd instructions end up on opposite processor sides. This is the simplest possible version of side-steering, and it can be improved.

Given that the function starts with a taken branch, it should always be possible to know which *possible* groupings each basic block starts with. If a block has multiple predecessors and one falls through into the current block, an alternate grouping is possible (unless the linear predecessor ends with a complete group).

Branch probabilities and defs in previous blocks are ignored. The only knowledge added at the beginning of a block compared to trunk is the set of possible alternate decoder groupings, which enterMBB() is extended to compute (see the sketch below).

These are modelled by GroupOffsets. GroupOffset[0] is always true, since it simply reflects the current scheduler state regardless of any alternate groupings. If GroupOffset[1] is true, there is possibly already one instruction in the current decoder group, and similarly for GroupOffset[2].
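A minimal sketch of how enterMBB() could seed these offsets (the parameterization is an assumption, not the patch code):

    // A taken-branch edge always starts a fresh decoder group, which
    // GroupOffsets[0] represents. A fall-through predecessor may leave
    // 1 or 2 instructions in an open group, adding an alternate offset.
    void enterMBB(bool HasFallThroughPred, unsigned PredOpenGroupSize,
                  bool GroupOffsets[3]) {
      GroupOffsets[0] = true;
      GroupOffsets[1] = GroupOffsets[2] = false;
      if (HasFallThroughPred && PredOpenGroupSize >= 1 && PredOpenGroupSize <= 2)
        GroupOffsets[PredOpenGroupSize] = true;
    }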

There are now three alternative situations:

  1. No group offsets. Side steering means looking at the cycle indexes of the two SUs and directly comparing whether they are both high or low. Indexes 0-2 are low and 3-5 are high, corresponding to the "left" / "right" processor sides.
  2. One offset. There are then smaller groups (of two slots) that hold in both alternatives, and these are checked instead of the full groups.
  3. Two offsets. Group limits could be anywhere, so only the distance-of-3 heuristic is sure to work.

SystemZHazardRecognizer is extended with

  • SideSteerIndexes: A map that records the decoder cycle index at the point of emitting an SU, for the relevant side steering resource, e.g. the FPd unit, or a defined FXU register.
  • checkSide(): Implements points 1-3 above, either to check for the same or the opposite side (see the sketch after this list).
  • Some extra care has to be taken when emitting a non-taken branch, or when a block has multiple predecessors. If there are then any group offsets, they can and must be recomputed. See emitInstruction() and normalize().
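A self-contained sketch of the checkSide() cases (invented names; case 2's exact two-slot subgroup comparison is patch-specific and only summarized in a comment):

    // True if cycle indexes IdxA and IdxB (IdxB >= IdxA) provably decode
    // on the same processor side, given the possible group offsets.
    bool provablySameSide(unsigned IdxA, unsigned IdxB,
                          const bool GroupOffsets[3]) {
      if (!GroupOffsets[1] && !GroupOffsets[2])
        // Case 1: slots 0-2 are "left", slots 3-5 are "right".
        return ((IdxA % 6) < 3) == ((IdxB % 6) < 3);
      // Case 2 (one offset) compares the two-slot subgroups that hold in
      // both alternatives; that patch-specific check is omitted here.
      // Case 3: group limits could be anywhere, so only the exact
      // distance rule is safe: same side iff the distance is 0 modulo 6.
      return (IdxB - IdxA) % 6 == 0;
    }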

Evaluation on SPEC:

It is interesting to note that 88% of the SUs, at the point they are emitted, are in a state without any grouping offsets. 8% have offset 1, 3% have offset 2, and 0.75% have both offsets 1 and 2. A strong alternative would therefore be to simply always ignore the alternate groupings. This would simplify the patch greatly, as most of the complexity lies in keeping track of the offsets. It also seems to work about as well in preliminary runs.

Using this patch seems to give perhaps 0.2-0.3% improvement on average over the benchmarks.

It is curious to note that the bypassing heuristic makes the scheduler run out of alternative SUs more often. This seems to mean that there are a few more instances where a cracked instruction breaks a group early, etc. The more aggressive the FXU heuristic is, the more this happens, although it is quite marginal to begin with. See the attached table:

C is master (unmodified). E adds just the improved FPd side steering. G, I, K and M additionally use FXU side steering with different cutoffs of the height (M has no cutoff -> most aggressive). The columns show how much a more aggressive FXU side steering influences the other statistics. E shows some improved FPd scheduling. BypassCost has 'Known' values -2 (good), and 1 and 2 (bad). The "rest" are all the cases where the scheduler "does not know". Similarly for GroupingCosts.

Compile time:

Since the noCost() method now looks for a -2 bypass cost, many more (x7!) candidates are evaluated:

                                          master       lim5    lim10000
Number of sched candidates evaluated:     272177    2077046     2201446
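A hedged guess at the mechanism (sketch only; the field names are invented): noCost() decides when the scan over the Available set can stop early, and it now only stops once a full bypass has also been found, so many more candidates get evaluated:

    struct Candidate { int GroupingCost, ResourcesCost, BypassCost; };

    // Stop evaluating further candidates only for a "perfect" one, i.e.
    // one that also has the best possible bypass cost (-2).
    bool noCost(const Candidate &C) {
      return C.GroupingCost <= 0 && C.ResourcesCost <= 0 &&
             C.BypassCost == -2;
    }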

This is also somewhat indicated by --time-passes. Average post-RA scheduler pass percentage of compile time:

master
User 1.39%   | System 1.26%  |  User+Sys 1.4%  | Wall 1.49%
User 1.21%   | System 1.07%  |  User+Sys 1.19%  | Wall 1.48%

lim5
User 1.46%   | System 1.28%  |  User+Sys 1.47%  | Wall 1.69%
User 1.4%    | System 1.21%  |  User+Sys 1.39%  | Wall 1.66%

lim10000
User 1.38%  | System 1.19%  |  User+Sys 1.36%  | Wall 1.65%
User 1.4%   | System 1.10%  |  User+Sys 1.38%  | Wall 1.65%

This is not that much, and if it is an issue it can probably be improved further.

Experimental options:

SIDESTEERING_FXU: enables the FXU side steering. Without it, only FPd side steering is performed.

FXU_HEIGHTDIFF: Sets a cutoff for when to stop looking for an FXU bypass in the Available set. If Best is this much higher than the last tried candidate, it is accepted without a bypass. This adjusts the aggressiveness of the bypass heuristic.

DOGROUPS: Always track the groups as if there are no alternative groupings.
NOSIDESTEERRESET: Don't reset side-steering. If used with DOGROUPS, this gives the behaviour of "ignoring groups" (the simplified version of the patch).
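Later diffs expose these as command-line flags (e.g. -sidesteer-fxu below). For reference, a typical way to declare such experimental switches in LLVM looks like this (the flag names and defaults here are assumptions):

    #include "llvm/Support/CommandLine.h"
    using namespace llvm;

    static cl::opt<bool> SideSteerFXU(
        "sidesteer-fxu", cl::init(false),
        cl::desc("Side-steer FXU (B2B) register defs and uses"));

    static cl::opt<int> FXUHeightDiff(
        "fxu-heightdiff", cl::init(5),
        cl::desc("Height cutoff for the FXU bypass search"));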

Diff Detail

Event Timeline

jonpa created this revision.Mar 5 2018, 7:46 AM
jonpa added a comment.Mar 5 2018, 7:47 AM

This is the table mentioned in the Summary.

jonpa updated this revision to Diff 315875.Jan 11 2021, 11:56 AM
jonpa edited subscribers, added: Andreas-Krebbel; removed: MatzeB.

This patch has been improved to make use of B2B information. B2BW, B2BR, and B2BRW FUs have been added to the SchedModel so that instructions can be modeled to use them. B2BRW is not really needed, but I tried using it for readability. This is one way of keeping track of which instructions can read and/or write B2B. A disadvantage is that the enum for the ProcResources is not available from TableGen, so it has been added locally instead for now. It looked like there was enough irregularity among the opcodes to motivate this approach, although the differences between subtargets were very small.
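A sketch of the locally added enum (the numeric values are placeholders and must match what TableGen actually generates for the subtarget's scheduling model):

    // TableGen does not emit named constants for ProcResource indexes,
    // so the B2B units are mirrored locally. Values are assumptions.
    enum B2BProcResIdx : unsigned {
      B2BW  = 1,  // writes a back-to-back forwardable result
      B2BR  = 2,  // can read a back-to-back forwarded value
      B2BRW = 3   // both; not strictly needed, kept for readability
    };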

One open question is what happens if one instruction defines the high part of a register on one side, and another instruction defines the low part on the other side. Should these subregs be tracked separately, or is it always the full (64-bit) register that is written? (What about 128-bit?)

I revisited the question from before of whether the assumption that the first instruction in the MBB begins a new decoder group actually holds. A naive estimate can be made by categorizing the types of incoming edges and their relative frequencies (this ignores probabilities):

  • In these cases, the assumption is correct (the MBBs are scheduled in linear order, so an edge from a not yet scheduled pred must also be a taken branch):
Multiple predecessors, incoming Taken Branch: 29%
Multiple predecessors, block not scheduled: 10%
Single predecessor, sched-state known, taken branch: 9%
Multiple predecessors, linear pred ends group: 8%
Entry-blocks: 3%
Single predecessor, not scheduled : 2%
  • These edges (blocks) simply continue as before, right or wrong:
Single predecessor, sched-state known, linear pred: 31%
  • These edges mean that the scheduler is wrong:
Multiple predecessors, linear pred has 1 in group: 6%
Multiple predecessors, linear pred has 2 in group: 2%

In summary,
61% of the incoming edges are known to lead to correct scheduling.
8% of the incoming edges are known to lead to an unmodelled group offset in the scheduler.
31% continue from before, which means they should not change the ratio; among the edges that decide it, the split is then 61/(61+8), i.e. roughly 88% vs 12%.

As soon as a cracked instruction is scheduled, the scheduler is right again after that point, so the above might be seen as a bit pessimistic.

So it seems that in 9 out of 10 cases the scheduler's assumption is generally right, even though this does not take into account the actual hotness of the edges. And even if the grouping is off, there is still a chance for the bypass if the instructions are scheduled next to each other:

[x _ x]  90%
[_ x x]  97%
[x x _]  93%
[x _ _][_ _ _]
[x _ _] 100%

The next question is how effective the scheduler is at actually producing a schedule that puts B2B reads on the same side as their B2B writes, under the assumption that it can track the current decoder slot. Without any particular heuristic this should by chance be 50/50: with a random schedule, half of the reads end up on the right side.

Without B2B side-steering enabled during node selection:

B2B reads with good schedule: ~8%   (58% ratio)
B2B reads with bad schedule : ~6%

With side-steering of B2B reads only (-sidesteer-fxu):

B2B reads with good schedule: ~9%   (71% ratio)
B2B reads with bad schedule : ~4%

With side-steering of B2B reads and writes (-sidesteer-fxu -sidesteer-lastslot):

B2B reads with good schedule: ~10%  (75% ratio)
B2B reads with bad schedule : ~3%

This shows an improvement with a higher ratio of good B2BR scheduled nodes.

The bypass heuristic is used with a lower priority than grouping or resources, but those costs were present in only 2% of the cases where there was a bypass cost.
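In other words, the candidate comparison presumably considers the costs in this order (a sketch with invented names, not the patch's actual code):

    struct Cand { int GroupingCost, ResourcesCost, BypassCost; };

    // Grouping and resource costs dominate; the bypass cost only breaks
    // the remaining ties (lower is better, -2 being a sure bypass).
    bool betterThan(const Cand &A, const Cand &B) {
      if (A.GroupingCost != B.GroupingCost)
        return A.GroupingCost < B.GroupingCost;
      if (A.ResourcesCost != B.ResourcesCost)
        return A.ResourcesCost < B.ResourcesCost;
      return A.BypassCost < B.BypassCost;
    }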

When any node with a B2B (write or read) cost was scheduled, there were generally not many nodes available to choose from:

1 available: 50%
2 available: 27%
3 available: 11%
6 or more available: 3%

For the B2BW nodes which did not get handled, in 92% of the cases it was the only node available.
For the B2BR nodes which ended up on the wrong side, 88% of them were the only node available.

So it seems that the potential of this patch is limited by the fact that the starting point is not "0%", but rather around 50%, and also because the availability of alternate nodes is typically low.

With -sidesteer-exact, the only placement aimed for is the same slot in a following decoder group (modulo 6 instructions), which would be immune to incorrect tracking of decoder groups (linear predecessor fall-through).
This gave far fewer known beneficially scheduled reads, which should be due to the low number of available instructions.

Possible improvements / ideas:

  • If an instruction uses two registers that are both defined with a B2BW, one could try to put both definitions on the same side.
  • It has been a while since I checked, but it may be worth looking into "breaking anti-dependencies" before post-RA sched, to perhaps make more instructions available.

Benchmarks:

I compared master to "-sidesteer-fxu -sidesteer-lastslot" (1), "-sidesteer-fxu -sidesteer-exact" (2), and "-sidesteer-fxu -newfpd-sides" (3).

  1. This gave small mixed results during the first run of SPEC-17. A few benchmarks were then rerun in "full" mode, and out of these namd and xalancbmk improved ~1%. Namd is not an integer benchmark, but maybe this was related to some induction variable in some loop?
  2. This gave, in the "full" run, a 2% improvement on xalancbmk, but also a 2% regression on omnetpp and a 1% regression on lbm.
  3. This used the tracked groups for FPd ops, but it did not seem to improve benchmarks this time around either.
Herald added a project: Restricted Project.Jan 11 2021, 11:56 AM
jonpa updated this revision to Diff 319313.Jan 26 2021, 9:01 AM

Latest improvements - still with ongoing experiments.

  • Instead of keeping the side of the last cycle index of defined GPRs, just remember whether the B2BW SU went left or right.
  • B2BEdge() looks at the DAG and determines whether two instructions are B2B (see the sketch below).
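A rough, self-contained sketch of what B2BEdge() checks (the types are minimal stand-ins for the scheduler's DAG classes, not the real LLVM ones):

    #include <vector>

    struct SU {
      bool WritesB2B = false;            // modeled via the B2BW/B2BRW units
      bool ReadsB2B = false;             // modeled via the B2BR/B2BRW units
      std::vector<const SU *> DataPreds; // true data dependences only
    };

    // Def and Use form a B2B pair if Use really consumes a value from
    // Def and both ends are modeled with the B2B forwarding units.
    bool B2BEdge(const SU *Def, const SU *Use) {
      if (!Def->WritesB2B || !Use->ReadsB2B)
        return false;
      for (const SU *P : Use->DataPreds)
        if (P == Def)
          return true;
      return false;
    }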

Currently trying two different approaches: "heuristic" and "look-ahead". The look-ahead should hopefully eventually provide a near-optimal result, while the heuristic might get close while staying simpler...

jonpa updated this revision to Diff 320228.Jan 29 2021, 3:41 PM

Patch updated with latest improvements (still experimental).

I tried refining the look-ahead technique while looking into single cases that could be improved relative to the heuristic approach; however, this typically seemed to fix a particular issue while making overall results worse:

  • checking against the number of other available instructions, in order to let the "next" side begin on the "current" side when it had to (as well as letting the current side spill over into the next side when it did not fit in the current one).
  • handling so that the "other pred", which is also available as a separate candidate, did not get scheduled if the algorithm had decided that the optimal sequence was SU, "other pred", SU.

The look-ahead algorithm could handle tricky overlapping cases with cost functions, but it got too complicated with the 3-instruction case ("other pred"). This is rather simple to state as a heuristic: "If SU has a B2B user that is also waiting for another predecessor, schedule all three of them if on the first decoder slot." But making that work in a more comprehensive context was not at all as simple...

This version shows the look-ahead algorithm that gave the best numbers (see below), which was the one without the two points above. Overall, it was just slightly better than the heuristic approach. The first section below lists the number of instructions that had a particular number of Good/Bad B2BR operands as they were scheduled. For example, 138k instructions with a single B2BR operand were now scheduled on the good side. There were fewer instructions with both operands on the bad side; some of them got both on the good side, some became 1/1.

master <> heuristical approach
Good: 0  Bad 1   264532   125894  -138638
Good: 0  Bad 2    12455     6691    -5764
Good: 1  Bad 0   506788   645426   138638
Good: 1  Bad 1     9410    11460     2050
Good: 1  Bad 2       71      101       30
Good: 2  Bad 0     7617    11331     3714
Good: 2  Bad 1       52       51       -1

Sum edges LHS: Good:  531626 Bad:  299163
Sum edges RHS: Good:  679755 Bad:  151034
                     +148129      -148129
heuristical <> look-ahead
Good: 0  Bad 1   125894   121273    -4621
Good: 0  Bad 2     6691     7295      604
Good: 1  Bad 0   645426   650047     4621
Good: 1  Bad 1    11460     9803    -1657
Good: 1  Bad 2      101      104        3
Good: 2  Bad 0    11331    12384     1053
Good: 2  Bad 1       51       50       -1
Good: 2  Bad 2        0        1        1

Sum edges LHS: Good:  679755 Bad:  151034
Sum edges RHS: Good:  684827 Bad:  145962
                      +5072        -5072

The look-ahead moved a few more instructions to the good side.

jonpa updated this revision to Diff 322006.Feb 7 2021, 4:57 PM

I started to simplify the patch and handled one minor regression, and then realized something... (see below :-)

Patch simplified towards getting ready to commit:

  • The look-ahead algorithm removed.
  • Minor regression on imagick handled by having a "height cutoff": I figured that the only way this patch could make things worse was by pulling lower nodes up in the schedule so much that it would slow down the critical path. I found that by using a height limit cutoff that takes the higher node regardless of the B2B cost, I could get nearly all of the lbm improvement while also eliminating the imagick regression (2% slowdown); see the sketch after this list. With the best value (-2), I now see +100k more "Good" edges instead of +147k, but benchmarking seemed to say this was better even though the total improvement number is lower.
  • B2BWrites: The "OtherOpSides" handling checked whether the B2BR of an *unscheduled* B2BW has *another* B2BR operand whose definition is already scheduled on the other ("wrong") side. If so, a positive cost was returned in the hope of scheduling the B2BW later on the other side. This gave merely <500 more "Good" operands. 96% of the B2B readers have only one B2B operand, and this handling gives only an extra 0.3% improvement, so it was removed.
  • The "OtherPred" handling is meant to handle the case where an available (unscheduled) B2BW has a B2BR successor, but the B2BR also has one other unscheduled predecessor. This gave ~7k more "Good" operands (without the height cutoff). It is still here, but it could perhaps be removed. The "triangle" cases (where that other B2BR predecessor is the B2BW SU itself) did not seem to give any benefit on benchmarks (about 900 more "Good" instructions), so they were removed for the sake of simplicity.
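A minimal sketch of the height cutoff from the first point above (names invented; the patch's details may differ):

    // Take the taller candidate outright, ignoring the B2B cost, when it
    // is high enough relative to the bypass candidate; with the best
    // cutoff value of -2, even a slightly lower node wins, which keeps
    // the critical path from being stretched.
    bool takeByHeight(int BestHeight, int CandHeight, int Cutoff = -2) {
      return BestHeight > CandHeight + Cutoff;
    }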

At this point I was happy to have a 10% improvement on LBM, with the imagick slowdown I had seen for a while now eliminated. There were actually 1-2 other 2-3% improvements as well with the reduced benchmarking, but with -F there was only LBM.

Since only a single benchmark improved notably, I wanted to see if this was really B2B related or not. LBM contains many FP divides, and it might also be related to grouping (different schedules might make a cracked instruction available in the right slot, etc.).

Experiments showed that rescheduling MBB3 *alone* in lbm.s (LBM_performStreamCollide_TRT) with this patch applied gives the 8-9% speedup! And it did indeed seem to be due to B2B side steering: with the patch, the LA of R9 and 5 of its 6 users are on the same side, and all the AGFIs are on the right side, which seems to be near ideal, with the data flow happening on the same side:

patched:

        jhe     .LBB8_8             // NT branch on first slot
.LBB8_3:                                # %for.body
					# =>This Inner Loop Header: Depth=1
	la      %r9, 0(%r1,%r3)
	l       %r0, 152(%r1,%r2)
     ld      %f1, 0(%r1,%r2)
     pfd     1, 1432(%r1,%r2)
     pfd     2, 1280(%r1,%r3)
	lgr     %r5, %r9            // R9 B2B
	lgr     %r8, %r9
	lgr     %r7, %r9
     pfd     2, -14704(%r1,%r3)
     pfd     2, 17288(%r1,%r3)
     lgr     %r6, %r9              // R9 read on wrong side just once
	lgr     %r4, %r9           // R9 B2B
	lgr     %r10, %r9
	agfi    %r5, 1617368
     agfi    %r6, -1614608         // R6 B2B
     pfd     2, 0(%r6)
     pfd     2, 0(%r5)
	agfi    %r8, -1582624      // R8, R7, R4 B2B
	agfi    %r7, 1585384
	agfi    %r4, 1601320
     pfd     2, 0(%r4)
     pfd     2, 0(%r7)
     pfd     2, 0(%r8)
	agfi    %r10, -1598672     // R10 B2B
	tmll    %r0, 1
	pfd     2, 0(%r10)
     jne     .LBB8_1

unpatched:

        jhe     .LBB8_8             // NT branch on first slot
.LBB8_3:                                # %for.body
					# =>This Inner Loop Header: Depth=1
	l       %r0, 152(%r1,%r2)
	ld      %f1, 0(%r1,%r2)
      la      %r9, 0(%r1,%r3)
      lgr     %r5, %r9
      lgr     %r8, %r9
	lgr     %r7, %r9          // 3 R9 reads on wrong side
	lgr     %r6, %r9
	lgr     %r4, %r9
      lgr     %r10, %r9
      agfi    %r5, 1617368
      agfi    %r8, -1582624
	agfi    %r7, 1585384
	agfi    %r6, -1614608
	pfd     1, 1432(%r1,%r2)
      agfi    %r4, 1601320        // R4 wrong side
      pfd     2, 1280(%r1,%r3)
      pfd     2, -14704(%r1,%r3)
	agfi    %r10, -1598672    // R10 wrong side
	tmll    %r0, 1
	pfd     2, 17288(%r1,%r3)
      pfd     2, 0(%r10)
      pfd     2, 0(%r4)
      pfd     2, 0(%r6)
	pfd     2, 0(%r7)
	pfd     2, 0(%r8)
	pfd     2, 0(%r5)
       jne     .LBB8_1

The trunk (unpatched) schedule is not entirely bad, since it randomly puts several reads B2B. It seems that in this case, with many LGRs from the same LA which in turn are used by AGFIs, it is clearly beneficial to have all of that happening on one side. Since this involves a lot of register moves, maybe the z15 machine has better register renaming support, which would explain why there is no further benchmark improvement on that machine?

So this was not due to FP divides or cracked instructions - it was quite likely benefiting from the side-steering, which is what I wanted to see. BUT...

That code itself is only intended to compute the addresses for the prefetches, and it is far from optimal: a single LA, with multiple LGRs and AGFIs then used by PFDs without any displacement. So I decided to try to hand-code that into better assembly:

jhe     .LBB8_8
.LBB8_3:                                # %for.body
					# =>This Inner Loop Header: Depth=1
	l       %r0, 152(%r1,%r2)
	ld      %f1, 0(%r1,%r2)
	la      %r9, 0(%r1,%r3)     // r9 is live out
	la      %r7, 0(%r1,%r3)     // the other regs (r4-r8 seemed to be local only)
	la      %r8, 0(%r1,%r3)     // (la seems just as fast as lgr per table...)
	agfi    %r7, 1585384
	agfi    %r8, -1614608
	pfd     1, 1432(%r1,%r2)
	pfd     2, 1280(%r1,%r3)
	pfd     2, -14704(%r1,%r3)
	tmll    %r0, 1
	pfd     2, 17288(%r1,%r3)
	pfd     2, 15936(%r8)
	pfd     2, 15936(%r7)
	pfd     2, 0(%r8)
	pfd     2, 0(%r7)
	pfd     2, 31984(%r7)
	pfd     2, 31984(%r8)

Would doing this alone *without* this B2B patch provide the same benefit? Yes! And even more than before: 12%!!

I reran LBM with -F:

clang unpatched                    : 335s
Handcoded per above                : 295s   88%
GCC (ffp-contract=off)             : 362s  108%
clang, disabling extra prefetching : 479s  143%

It looks like clang benefited heavily (and still does) from the increased prefetching, but there was still a further improvement to be had in the generation of those addresses for the prefetches.

It seems to me now that the address computations should be handled first, and then this patch could possibly be revisited (I tried putting the two LAs/AGFIs I used on the *wrong* sides, and it did not seem to matter...).