This is an archive of the discontinued LLVM Phabricator instance.

SystemZ scheduling implementation
ClosedPublic

Authored by jonpa on Feb 15 2016, 2:31 AM.

Details

Reviewers

uweigand
atrick
hfinkel

Summary

General review and comments / suggestions would be greatly appreciated for this implementation of instruction scheduling for SystemZ.

It contains some changes outside the SystemZ backend, which will be addressed in separate revisions, but are included here also since they are part of this project.

There are some experimental parts left, such as statistics and options, which will not be committed.

Main points are:

Post-RA MI scheduling with a HazardRecognizer for decoder groups, and a custom SchedStrategy.
Pre-RA MI scheduling enabled.
Instruction scheduling classes and definitions for z13, zEC12 and z196. z10 scheduling is not supported.

Diff Detail

Event Timeline

jonpa updated this revision to Diff 47963. Feb 15 2016, 2:31 AM

jonpa retitled this revision from to SystemZ scheduling implementation.

jonpa updated this object.

jonpa added reviewers: atrick, hfinkel.

jonpa added a subscriber: llvm-commits.

Herald added subscribers: qcolombet, MatzeB. Feb 15 2016, 2:31 AM

A minor change to make Release builds succeed.

Overall nice job. I won't be able to review your hazard recognizer or any of the SystemZ models.

lib/Target/SystemZ/SystemZISelLowering.cpp
123–125 ↗	(On Diff #47989)	You won't be able to rely on this since the SelectionDAG scheduler is deprecated. It's just waiting for a replacement. I think you should focus on the right MI scheduler heuristics for your target. That said, I can see why you did this: your MI scheduler is top-down only, so it might be hard for you to control register pressure. As an alternative, we could easily support a two-pass MI scheduler, bottom-up, then top-down.
lib/Target/SystemZ/SystemZInstrInfo.h
171–173	FYI, the "new" machine model is meant to be flexible enough that you don't need to create your own hazard recognizer (you can add predicates and arbitrary pseudo machine resources). However, it's tricky to do that and fine just to use a hazard recognizer when you have complicated decode/issue group constraints.
lib/Target/SystemZ/SystemZScheduleZ13.td
44	In-order scheduling with multiple functional units of the same type is somewhat broken in the generic scheduler. ReservedCycles is only tracking a worst-case resource availability across all units. It should really be a two-dimensional array. I know this has been fixed out-of-tree for at least one target but never pushed back. It would be great if you fixed that!
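
To illustrate the ReservedCycles remark above, here is a minimal standalone sketch. It is not LLVM code and not part of this patch: the two tracker types and their `reserve()` method are hypothetical names, and the model assumes a single resource kind with several identical units. The first tracker keeps one "busy until" cycle for the whole kind (the worst-case view); the second keeps one entry per unit, which is the extra dimension being asked for.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical worst-case tracker: one "busy until" cycle for the whole
// resource kind, regardless of how many identical units it has.
struct WorstCaseTracker {
  unsigned ReservedUntil = 0;

  // Returns the cycle at which an op requested at Cycle can start.
  unsigned reserve(unsigned Cycle, unsigned Duration) {
    unsigned Start = std::max(Cycle, ReservedUntil);
    ReservedUntil = Start + Duration;
    return Start;
  }
};

// Hypothetical per-unit tracker: one "busy until" cycle per unit. With one
// such vector per resource kind, the state becomes two-dimensional.
struct PerUnitTracker {
  std::vector<unsigned> ReservedUntil;

  explicit PerUnitTracker(std::size_t NumUnits) : ReservedUntil(NumUnits, 0) {}

  // Picks the unit that becomes free first and reserves it.
  unsigned reserve(unsigned Cycle, unsigned Duration) {
    auto It = std::min_element(ReservedUntil.begin(), ReservedUntil.end());
    unsigned Start = std::max(Cycle, *It);
    *It = Start + Duration;
    return Start;
  }
};
```

With two units and two three-cycle ops requested at cycle 0, the per-unit tracker starts both at cycle 0, while the worst-case tracker delays the second one to cycle 3.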

This revision is now accepted and ready to land. Feb 15 2016, 11:22 AM

jonpa added inline comments. Feb 16 2016, 6:08 AM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–125 ↗	(On Diff #47989)	I would hope that the pre-RA MI scheduler is bidirectional, due to the overrideSchedPolicy() call, where this is selected. Is there perhaps a better or more normal way of doing this?
lib/Target/SystemZ/SystemZScheduleZ13.td
44	This unit is handled as a special case by the HazardRecognizer, I guess partly because I couldn't see that the code - as you say - did what I wanted. That would be interesting to fix... could this out-of-tree target perhaps push its fix?

Latencies of z13 vector instructions corrected.

One more test case updated.

jonpa added a reviewer: uweigand. Feb 17 2016, 5:18 AM

atrick added inline comments. Feb 17 2016, 9:44 AM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–138 ↗	(On Diff #48068)	Right, I was looking at your PostRA policy, which is rightly top-down. That said, you might still achieve better register pressure results with a two-pass scheduling approach. I tried hard to wedge all heuristics into a single pass because I was paranoid about compile time. Ultimately it's whatever works for your target; I was just pointing out that SelectionDAG is a bad place for scheduler heuristics. We could support a multiple-pass MI scheduler if anyone needs it.
lib/Target/SystemZ/SystemZScheduleZ13.td
45	It wasn't a complete/general fix. But yes, I'll encourage anyone I can to improve the in-tree code. Out-of-tree work usually leads to problems. On the other hand, making it easy to write custom, possibly out-of-tree schedulers was a major goal of MI scheduling.

jonpa added inline comments. Feb 17 2016, 11:54 PM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–138 ↗	(On Diff #48068)	I am curious as to what you think would be the possibilities of a multi-pass scheduling approach. I have tried this once before like: First do a minimal reg-pressure scheduling. Then increase parallelism (overlapping live intervals) only when it seems to not cause too much spilling. Is this also what you had in mind? Currently this is not needed for SystemZ, as the main focus at least right now is on JIT compilation.

atrick added inline comments. Feb 18 2016, 8:27 AM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–138 ↗	(On Diff #48068)	Yes that's what I had in mind. I didn't take that approach because I wanted to preserve source order in common cases where an out-of-order target has enough registers, as well as compile time.

Latencies corrected, mainly for z13 vector instructions.

Improved modelling of execution units by separating the execution units and the decoder slots needed for each instruction. Instructions with a double use of an exec unit or with a coupled use of the LSU now get this modelled.

For z13, more of the pipelines have been modelled (FXa, FXb, and the various vector pipelines).

LSU latency corrected to 4.
Latencies properly summed for cracked / expanded instructions. Instructions with joined dispatch have the same latency as single-issue type instructions.

Tried to use an include of common defs for the different SchedModels, but TableGen rejected this -- see the comment at the top of, for example, SystemZScheduleZ13.td.

This patch is on its way, but some regressions need to be fixed first.

jonpa mentioned this in D24451: [LoopUnroller] Replace UnrollingPreferences::Force with ForceMaxCount + SystemZ getUnrollingPreferences(). Sep 20 2016, 4:10 AM

This was already approved before, but that was quite some time ago, so I now reopen this review since my patch then never passed the performance measurements.

This is now *post-RA scheduling only* for SystemZ. Nearly all of the patch belongs to the SystemZ backend.

Herald added subscribers: modocache, mgorny, beanz. Oct 7 2016, 5:01 AM

NFC update per Uli's requests. No longer any common code changes.

uweigand added inline comments. Oct 16 2016, 8:37 AM

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
45	I guess instead of this we could simply use getInstrInfo on the SchedModel.
85	Is there a reason why this is a separate function and not just done directly in ::Reset()?
155	This looks a bit ad-hoc ... isn't there a more generic way to find the shorter name?
lib/Target/SystemZ/SystemZHazardRecognizer.h
121	Why does groupingCost use the return value while resourcesCost uses an output parameter?
lib/Target/SystemZ/SystemZMachineScheduler.h
58	As discussed offline, we really need to get rid of the global/static variable here. I note that there is quite a bit of similarity between this "preliminary" sorter and the final sorter in Candidate. Maybe we should actually store "preliminary" candidates in the Available list (with GroupingCost and ResourceCost only set to 0 or 1 depending on whether grouping or reserved resources are involved), and then update the cost parameters with actual values once we know them? We then might even be able to reuse the same comparison routine ...
lib/Target/SystemZ/SystemZScheduleZ13.td
28	We should really try to get this complete, so that instructions added in the future don't accidentally lack scheduling information. How difficult would it be to get there?
104	Ideally, the ordering in these files should mostly correspond to the ordering in the original InstrInfo files, just to make them easier to find ...
105	Also, it is somewhat annoying that we need to list not just the basic instruction definitions, but all the various aliases as well. I'm wondering if there isn't some way to annotate the Alias definition in the main file with the opcode the alias will be resolved to in the end, so that it can be used for scheduling purposes ...
225	This is not an "And", it's a non-transactional store and should go with the transaction-related instructions.
582	It would be nice to at least separate out vector floating-point instructions, so we can easily see where W variants are needed.
739	I don't think there's a real difference between those and the ones listed under Other.
764	This is just an alias for a LARL and should go there.

Updated per requests.

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

For the rest, please see replies to comments below.

jonpa requested a review of this revision. Oct 18 2016, 5:46 AM

jonpa edited edge metadata.

jonpa added inline comments.

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
155	Fixed by using string::substr()/resize() instead. Now all units should really be named per a Z13_XXXUnit pattern.
lib/Target/SystemZ/SystemZMachineScheduler.h
58	I gave this a try, but I thought it was a bit messy to flip the sorting variables back and forth inside a set of Candidates. I instead use the isScheduleHigh flag for groupers / FPd ops, which simplifies the SUSorter method while also eliminating the static variable. The iteration in pickNode() should be nearly unaffected, since these nodes are quite rare.
lib/Target/SystemZ/SystemZScheduleZ13.td
28	I added the hasNoSchedulingInfo flag on the appropriate instructions, and then set CompleteModel (the reason I did not do that before is that AFAICR, this flag then also demanded modeling of operand writes or something of that sort). It is worth mentioning that this triggers an error during the build for any instruction missing scheduling input for all subtargets. In a debug build, TargetSchedule.cpp will then cause an abort during compilation if the subtarget does not have scheduling input for an emitted instruction.
104	I have reorganized the files completely to match the InstrInfo file sections.
105	Could not as of yet find any way to achieve this, but there might be some way...

jonpa added inline comments. Oct 18 2016, 5:50 AM

lib/Target/SystemZ/SystemZHazardRecognizer.h
121	During experiments I have also returned a cost here for the 'other' processor side. I guess I could change that back now until it's needed again...?

In D17260#572785, @jonpa wrote:

Updated per requests.

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

For the rest, please see replies to comments below.

Looking quite good now, thanks! The one thing I'm still wondering about is the hasNoSchedulingInfo flags. Those are of course fine for the .insn directive, and also for custom-inserter pseudos. However, I don't think we want them for the Asm* branch variants; those are just regular branch instructions that might be used in inline assembler, and they really should have the same scheduling info as the standard form of the branch instructions. I think it should be straightforward to implement this by adding "(Asm)?" to the instregex strings for the branch instructions.
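
As a small illustration of the "(Asm)?" suggestion above, the sketch below uses std::regex rather than TableGen. The opcode names BRC and AsmBRC are used only as an example pair, the helper name `matchesBranchPattern` is hypothetical, and full-string matching is an assumption of this sketch; it does not claim to reproduce how TableGen's instregex anchors its patterns.

```cpp
#include <regex>
#include <string>

// Returns true if Name is matched in full by "(Asm)?BRC", i.e. by the
// standard opcode name or by its Asm-prefixed variant.
// Hypothetical helper for illustration; not part of the patch.
bool matchesBranchPattern(const std::string &Name) {
  static const std::regex Pattern("(Asm)?BRC");
  return std::regex_match(Name, Pattern);
}
```

Under these assumptions, both "BRC" and "AsmBRC" match a single pattern, so one InstRW entry could cover the regular branch and its inline-assembler form, while a longer name such as "AsmBRCL" would need its own pattern.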

lib/Target/SystemZ/SystemZMachineScheduler.h
58	OK, that makes sense.
lib/Target/SystemZ/SystemZScheduleZ13.td
27	I think this is the default setting, so we can just leave it out here.
28	Great, thanks! I do think we indeed want the error, even for older subtargets. Hopefully in the not-too-distant future we'll have completed the instruction set for the old processors anyway ...
104	Excellent, it is much more readable now!
105	Oh well, if there's no straightforward way, it's OK the way it is now ...
lib/Target/SystemZ/SystemZScheduleZ196.td
68	What's the EC12 doing here?

In D17260#572785, @jonpa wrote:

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

I'm not sure I understand the problem. I *think* you can mark your scheduling model "complete", then add InstAlias records in your .td file without adding new InstRW records...

In D17260#573108, @atrick wrote:

In D17260#572785, @jonpa wrote:

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

I'm not sure I understand the problem. I *think* you can mark your scheduling model "complete", then add InstAlias records in your .td file without adding new InstRW records...

This is not about InstAlias records (which are aliases for the assembler) -- those indeed work fine. The issue here was about aliases for the code generator. We use those usually because some instructions need operands on the MC level that we don't want to expose on the MI level. For example, a "return" instruction on SystemZ is actually a "br %r14", but at the MI level this doesn't have any operands, but is simply defined as an alias:

def Return : Alias<2, (outs), (ins), [(z_retflag)]>;

where Alias is formally an Instruction, but doesn't have any information about mnemonic or opcodes. Instead, when this alias is about to be emitted, we emit an actual BR instruction pattern, adding the R14 operand at this point.

This means that to get a complete scheduler model, we have to duplicate the scheduling info for BR also for Return. It would be nice if there were instead a way to say in the definition of Return to just look at BR for scheduling info.

In D17260#573123, @uweigand wrote:
In D17260#573108, @atrick wrote:

In D17260#572785, @jonpa wrote:

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

I'm not sure I understand the problem. I *think* you can mark your scheduling model "complete", then add InstAlias records in your .td file without adding new InstRW records...

This is not about InstAlias records (which are aliases for the assembler) -- those indeed work fine. The issue here was about aliases for the code generator. We use those usually because some instructions need operands on the MC level that we don't want to expose on the MI level. For example, a "return" instruction on SystemZ is actually a "br %r14", but at the MI level this doesn't have any operands, but is simply defined as an alias:
def Return : Alias<2, (outs), (ins), [(z_retflag)]>;
where Alias is formally an Instruction, but doesn't have any information about mnemonic or opcodes. Instead, when this alias is about to be emitted, we emit an actual BR instruction pattern, adding the R14 operand at this point.

This means that to get a complete scheduler model, we have to duplicate the scheduling info for BR also for Return. It would be nice if there were instead a way to say in the definition of Return to just look at BR for scheduling info.

The scheduling class is on the MCInstrDesc. AFAIK, there's no abstract way to tie your aliasing pseudo instruction's MCInstrDesc to their lowered instruction's MCInstrDesc. You can try to factor instruction records in the InstrInfo.td file itself by using a SchedRW field, but that's not the way you've structured things. I think the only reasonable thing to do here is duplicate the InstRW records. You've basically told CodeGen that these are two distinct MC instrs.

In D17260#573190, @atrick wrote:

The scheduling class is on the MCInstrDesc. AFAIK, there's no abstract way to tie your aliasing pseudo instruction's MCInstrDesc to their lowered instruction's MCInstrDesc. You can try to factor instruction records in the InstrInfo.td file itself by using a SchedRW field, but that's not the way you've structured things. I think the only reasonable thing to do here is duplicate the InstRW records. You've basically told CodeGen that these are two distinct MC instrs.

Yes, it looks like this is what we'll have to do. But I guess that's fine, we don't have all that many of those aliases anyway.

Updated:
Asm* instructions now also get a useful schedclass.
getNumDecoderSlots() fixed to handle generic opcodes correctly.
resourcesCost() returns an int, just like groupingCost()
'CompleteModel = 1' removed

What are your thoughts on asserting for target instructions' sched class desc if CompleteModel is true? (see comment below)

jonpa added inline comments. Oct 19 2016, 4:39 AM

lib/Target/SystemZ/SystemZScheduleZ13.td
27	aah, right.
28	I found that my previous statement on the compile-time checking for scheduling input was not actually correct: it does not really cover all instructions. Currently, asserts trigger only in two cases. (1) All subtargets have omitted the instruction from the Schedule .td files. TableGen catches this, and only this, because it isn't clever enough to know whether a given subtarget (with a missing sched class for an instruction) actually supports that instruction. (2) An instruction has a def operand and the sched class does not have a WriteLatency entry for it. This is a specialized assert (in computeOperandLatency()) which doesn't cover all instructions; I could, for instance, remove the InstRW for a compare and not get any assert triggering. So there really isn't any general assert that checks that, for a subtarget with a CompleteModel, all instructions actually emitted have a sane scheduling class. When I removed the InstRW for an instruction, it got its own (valid) schedclass for the subtarget, but with just 0 values. I think we could catch the error of forgetting a subtarget/instruction sched annotation with an assert in the scheduler that demands at least one uop for any target instruction. This has at least worked well during my earlier experiments. Should I add this just in the SystemZ backend? Or could it be part of the common code somewhere? (I am thinking this should be done both pre-RA and post-RA.) Or is there any reason not to demand this that I have missed?
lib/Target/SystemZ/SystemZScheduleZ196.td
68	Good heavens!
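
The "at least one uop for any target instruction" check proposed above could look roughly like the following standalone sketch. It is only an illustration under stated assumptions: `SchedClassInfo` and `findUnmodeledInstrs` are hypothetical stand-ins, not LLVM types or functions, and the real check would have to query the subtarget's resolved scheduling class for each emitted instruction.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a resolved scheduling class; not an LLVM type.
struct SchedClassInfo {
  std::string InstrName;
  bool IsTargetInstr;   // false for generic pseudos such as KILL
  unsigned NumMicroOps; // 0 when no InstRW was given for this subtarget
};

// Collects the names of target instructions whose scheduling class has no
// micro-ops, i.e. the candidates for a missing sched annotation.
std::vector<std::string>
findUnmodeledInstrs(const std::vector<SchedClassInfo> &Emitted) {
  std::vector<std::string> Missing;
  for (const SchedClassInfo &SC : Emitted)
    if (SC.IsTargetInstr && SC.NumMicroOps == 0)
      Missing.push_back(SC.InstrName);
  return Missing;
}
```

An assert on the returned list being empty would then flag an instruction whose InstRW was forgotten, while leaving generic pseudo instructions alone.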

uweigand added inline comments. Oct 19 2016, 6:21 AM

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
48	This looks OK to me. However, with this change we should now give a scheduling class to the DirectiveInsn pseudo instruction -- these do emit some instruction, we just don't know which one, so it should probably be modeled as some "generic" instruction.
lib/Target/SystemZ/SystemZHazardRecognizer.h
113	Seems you forgot to update the comment when changing the code :-)
lib/Target/SystemZ/SystemZScheduleZ13.td
28	I don't think doing it in the backend is the right place. In the backend, you only see the instructions that the code being compiled happens to use; and when you do find an error there, there's not much you can do. The right place does seem to be TableGen. And in fact, I had interpreted the code in CodeGenSchedModels::checkCompleteness to do just that check. If this doesn't work as expected, it probably ought to be fixed there. But in any case this is a separate problem and shouldn't hold up this patch.

Updated with an empty sched class for Insn.. instructions, so that getNumDecoderSlots() will return 1 and not 0.

jonpa added inline comments. Oct 19 2016, 6:44 AM

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
48	I used an empty InstRW construct, which seems to do the job.

OK, this looks good to me now. Thanks!

This revision is now accepted and ready to land. Oct 19 2016, 8:22 AM

Committed as r284704.

uweigand mentioned this in D26156: Fix per-processor model scheduler definition completeness check. Oct 31 2016, 9:50 AM

Revision Contents

Path	Size
lib/Target/SystemZ/CMakeLists.txt	2 lines
lib/Target/SystemZ/SystemZ.td	5 lines
lib/Target/SystemZ/SystemZHazardRecognizer.h	128 lines
lib/Target/SystemZ/SystemZHazardRecognizer.cpp	338 lines
lib/Target/SystemZ/SystemZInstrFormats.td	6 lines
lib/Target/SystemZ/SystemZInstrInfo.h	8 lines
lib/Target/SystemZ/SystemZInstrInfo.cpp	35 lines
lib/Target/SystemZ/SystemZInstrInfo.td	13 lines
lib/Target/SystemZ/SystemZMachineScheduler.h	112 lines
lib/Target/SystemZ/SystemZMachineScheduler.cpp	153 lines
lib/Target/SystemZ/SystemZProcessors.td	7 lines
lib/Target/SystemZ/SystemZSchedule.td	70 lines
lib/Target/SystemZ/SystemZScheduleZ13.td	980 lines
lib/Target/SystemZ/SystemZScheduleZ196.td	713 lines
lib/Target/SystemZ/SystemZScheduleZEC12.td	743 lines
lib/Target/SystemZ/SystemZTargetMachine.cpp	16 lines
test/CodeGen/SystemZ/vec-args-06.ll	32 lines
test/CodeGen/SystemZ/vec-perm-12.ll	6 lines

Diff 75137

lib/Target/SystemZ/CMakeLists.txt

(11 unchanged lines not shown)
 add_public_tablegen_target(SystemZCommonTableGen)

 add_llvm_target(SystemZCodeGen
   SystemZAsmPrinter.cpp
   SystemZCallingConv.cpp
   SystemZConstantPoolValue.cpp
   SystemZElimCompare.cpp
   SystemZFrameLowering.cpp
+  SystemZHazardRecognizer.cpp
   SystemZISelDAGToDAG.cpp
   SystemZISelLowering.cpp
   SystemZInstrInfo.cpp
   SystemZLDCleanup.cpp
   SystemZLongBranch.cpp
   SystemZMachineFunctionInfo.cpp
+  SystemZMachineScheduler.cpp
   SystemZMCInstLower.cpp
   SystemZRegisterInfo.cpp
   SystemZSelectionDAGInfo.cpp
   SystemZShortenInst.cpp
   SystemZSubtarget.cpp
   SystemZTargetMachine.cpp
   SystemZTargetTransformInfo.cpp
   SystemZTDC.cpp
   )

 add_subdirectory(AsmParser)
 add_subdirectory(Disassembler)
 add_subdirectory(InstPrinter)
 add_subdirectory(TargetInfo)
 add_subdirectory(MCTargetDesc)

lib/Target/SystemZ/SystemZ.td

 //===-- SystemZ.td - Describe the SystemZ target machine -----*- tblgen -*-===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//

 //===----------------------------------------------------------------------===//
 // Target-independent interfaces which we are implementing
 //===----------------------------------------------------------------------===//

 include "llvm/Target/Target.td"

 //===----------------------------------------------------------------------===//
+// SystemZ subtargets scheduling models.
+//===----------------------------------------------------------------------===//
+include "SystemZSchedule.td"
+
+//===----------------------------------------------------------------------===//
 // SystemZ supported processors and features
 //===----------------------------------------------------------------------===//

 include "SystemZProcessors.td"

 //===----------------------------------------------------------------------===//
 // Register file description
 //===----------------------------------------------------------------------===//
(39 unchanged lines not shown)

lib/Target/SystemZ/SystemZHazardRecognizer.h

This file was added.

//=-- SystemZHazardRecognizer.h - SystemZ Hazard Recognizer -----*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file declares a hazard recognizer for the SystemZ scheduler.
//
// This class is used by the SystemZ scheduling strategy to maintain
// the state during scheduling, and provide cost functions for
// scheduling candidates. This includes:
//
// * Decoder grouping. A decoder group can maximally hold 3 uops, and
// instructions that always begin a new group should be scheduled when
// the current decoder group is empty.
// * Processor resources usage. It is beneficial to balance the use of
// resources.
//
// ===---------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H
#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H

#include "SystemZSubtarget.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineScheduler.h"
#include "llvm/CodeGen/ScheduleHazardRecognizer.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/Support/raw_ostream.h"
#include <string>

namespace llvm {

/// SystemZHazardRecognizer maintains the state during scheduling.
class SystemZHazardRecognizer : public ScheduleHazardRecognizer {

  ScheduleDAGMI *DAG;
  const TargetSchedModel *SchedModel;

  /// Keep track of the number of decoder slots used in the current
  /// decoder group.
  unsigned CurrGroupSize;

  /// The tracking of resources here are quite similar to the common
  /// code use of a critical resource. However, z13 differs in the way
  /// that it has two processor sides which may be interesting to
  /// model in the future (a work in progress).

  /// Counters for the number of uops scheduled per processor
  /// resource.
  SmallVector<int, 0> ProcResourceCounters;

  /// This is the resource with the greatest queue, which the
  /// scheduler tries to avoid.
  unsigned CriticalResourceIdx;

  /// Return the number of decoder slots MI requires.
  inline unsigned getNumDecoderSlots(SUnit *SU) const;

  /// Return true if MI fits into current decoder group.
  bool fitsIntoCurrentGroup(SUnit *SU) const;

  /// Two decoder groups per cycle are formed (for z13), meaning 2x3
  /// instructions. This function returns a number between 0 and 5,
  /// representing the current decoder slot of the current cycle.
  unsigned getCurrCycleIdx();

  /// LastFPdOpCycleIdx stores the number returned by getCurrCycleIdx()
  /// when a stalling operation is scheduled (which uses the FPd resource).
  unsigned LastFPdOpCycleIdx;

  /// A counter of decoder groups scheduled.
  unsigned GrpCount;

  unsigned getCurrGroupSize() {return CurrGroupSize;};

  /// Start next decoder group.
  void nextGroup(bool DbgOutput = true);

  /// Clear all counters for processor resources.
  void clearProcResCounters();

  /// With the goal of alternating processor sides for stalling (FPd)
  /// ops, return true if it seems good to schedule an FPd op next.
  bool isFPdOpPreferred_distance(const SUnit *SU);

public:
  SystemZHazardRecognizer(const MachineSchedContext *C);

  void setDAG(ScheduleDAGMI *dag) {
    DAG = dag;
    SchedModel = dag->getSchedModel();
  }

  HazardType getHazardType(SUnit *m, int Stalls = 0) override;
  void Reset() override;
  void EmitInstruction(SUnit *SU) override;

  // Cost functions used by SystemZPostRASchedStrategy while
  // evaluating candidates.

  /// Return the cost of decoder grouping for SU. If SU must start a
  /// new decoder group, this is negative if this fits the schedule or
  /// positive if it would mean ending a group prematurely. For normal
  /// instructions this returns 0.
  int groupingCost(SUnit *SU) const;

  /// Return the cost of SU in regards to processor resources usage.
  /// A positive value means it would be better to wait with SU, while
  /// a negative value means it would be good to schedule SU next.
  int resourcesCost(SUnit *SU);

#ifndef NDEBUG
  // Debug dumping.
  std::string CurGroupDbg; // current group as text
  void dumpSU(SUnit *SU, raw_ostream &OS) const;
  void dumpCurrGroup(std::string Msg = "") const;
  void dumpProcResourceCounters() const;
#endif
};

} // namespace llvm

#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H */
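
The decoder-group bookkeeping described in the header comments above (groups of at most 3 slots, two groups per cycle, cracked and group-alone instructions starting a new group) can be tried out in isolation with the following minimal sketch. It is not the patch's code: `SchedClassFlags` and `DecoderGroupState` are hypothetical stand-ins for MCSchedClassDesc and the recognizer's state, the fit rule for normal instructions is a simplification, and the `emit()` rule for closing a group is an assumption of this sketch, since EmitInstruction() is not shown in this part of the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the BeginGroup/EndGroup bits of a sched class.
struct SchedClassFlags {
  bool BeginGroup;
  bool EndGroup;
};

// Hypothetical, simplified model of the recognizer's decoder-group state.
struct DecoderGroupState {
  static constexpr unsigned MaxGroupSize = 3;
  unsigned CurrGroupSize = 0; // decoder slots used in the current group
  unsigned GrpCount = 0;      // decoder groups completed so far

  // 2 slots for a cracked instruction, 3 for a group-alone one, else 1.
  static unsigned numDecoderSlots(SchedClassFlags SC) {
    if (SC.BeginGroup)
      return SC.EndGroup ? 3 : 2;
    return 1;
  }

  // An instruction that begins a group only fits into an empty group.
  bool fitsIntoCurrentGroup(SchedClassFlags SC) const {
    if (SC.BeginGroup)
      return CurrGroupSize == 0;
    return CurrGroupSize + numDecoderSlots(SC) <= MaxGroupSize;
  }

  // Slot index within the current cycle, 0..5 (two groups per cycle).
  unsigned currCycleIdx() const {
    return CurrGroupSize + (GrpCount % 2 ? 3 : 0);
  }

  // Assumed rule: close the group when it is full or the instruction ends it.
  void emit(SchedClassFlags SC) {
    assert(fitsIntoCurrentGroup(SC) && "caller must start a new group first");
    CurrGroupSize += numDecoderSlots(SC);
    if (CurrGroupSize == MaxGroupSize || SC.EndGroup) {
      CurrGroupSize = 0;
      ++GrpCount;
    }
  }
};
```

In this model, after one normal instruction a cracked instruction no longer fits, which is the situation where the real recognizer reports a hazard; after three normal instructions the group closes and the cycle index moves on to the second group of the cycle.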

lib/Target/SystemZ/SystemZHazardRecognizer.cpp

This file was added.

				//=-- SystemZHazardRecognizer.h - SystemZ Hazard Recognizer ------ C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines a hazard recognizer for the SystemZ scheduler.
				//
				// This class is used by the SystemZ scheduling strategy to maintain
				// the state during scheduling, and provide cost functions for
				// scheduling candidates. This includes:
				//
				// * Decoder grouping. A decoder group can maximally hold 3 uops, and
				// instructions that always begin a new group should be scheduled when
				// the current decoder group is empty.
				// * Processor resources usage. It is beneficial to balance the use of
				// resources.
				//
				// ===---------------------------------------------------------------------===//

				#include "SystemZHazardRecognizer.h"
				#include "llvm/ADT/Statistic.h"

				using namespace llvm;

				#define DEBUG_TYPE "misched"

				// This is the limit of processor resource usage at which the
				// scheduler should try to look for other instructions (not using the
				// critical resource).
				cl::opt<int> ProcResCostLim("procres-cost-lim", cl::Hidden,
				cl::desc("The OOO window for processor "
				"resources during scheduling."),
				cl::init(8));

				SystemZHazardRecognizer::
				SystemZHazardRecognizer(const MachineSchedContext *C) : DAG(nullptr),
				SchedModel(nullptr) {}

				unsigned SystemZHazardRecognizer::
				getNumDecoderSlots(SUnit *SU) const {
				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				uweigandUnsubmitted Done Reply Inline Actions I guess instead of this we could simply use getInstrInfo on the SchedModel. uweigand: I guess instead of this we could simply use getInstrInfo on the SchedModel.
				if (!SC->isValid())
				return 0; // IMPLICIT_DEF / KILL -- will not make impact in output.

				uweigand: This looks OK to me. However, with this change we should now give a scheduling class to the DirectiveInsn pseudo instruction -- these do emit some instruction, we just don't know which one, so it should probably be modeled as some "generic" instruction.
				jonpa: I used an empty InstRW construct, which seems to do the job.
				if (SC->BeginGroup) {
				if (!SC->EndGroup)
				return 2; // Cracked instruction
				else
				return 3; // Expanded/group-alone instruction
				}

				return 1; // Normal instruction
				}

				unsigned SystemZHazardRecognizer::getCurrCycleIdx() {
				unsigned Idx = CurrGroupSize;
				if (GrpCount % 2)
				Idx += 3;
				return Idx;
				}

				ScheduleHazardRecognizer::HazardType SystemZHazardRecognizer::
				getHazardType(SUnit *m, int Stalls) {
				return (fitsIntoCurrentGroup(m) ? NoHazard : Hazard);
				}

				void SystemZHazardRecognizer::Reset() {
				CurrGroupSize = 0;
				clearProcResCounters();
				GrpCount = 0;
				LastFPdOpCycleIdx = UINT_MAX;
				DEBUG(CurGroupDbg = "";);
				}

				bool
				SystemZHazardRecognizer::fitsIntoCurrentGroup(SUnit *SU) const {
				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				if (!SC->isValid())
				return true;

				// A cracked instruction only fits into schedule if the current
				uweigand: Is there a reason why this is a separate function and not just done directly in ::Reset()?
				// group is empty.
				if (SC->BeginGroup)
				return (CurrGroupSize == 0);

				// Since a full group is handled immediately in EmitInstruction(),
				// SU should fit into current group. NumSlots should be 1 or 0,
				// since it is not a cracked or expanded instruction.
				assert ((getNumDecoderSlots(SU) <= 1) && (CurrGroupSize < 3) &&
				"Expected normal instruction to fit in non-full group!");

				return true;
				}

				void SystemZHazardRecognizer::nextGroup(bool DbgOutput) {
				if (CurrGroupSize > 0) {
				DEBUG(dumpCurrGroup("Completed decode group"));
				DEBUG(CurGroupDbg = "";);

				GrpCount++;

				// Reset counter for next group.
				CurrGroupSize = 0;

				// Decrease counters for execution units by one.
				for (unsigned i = 0; i < SchedModel->getNumProcResourceKinds(); ++i)
				if (ProcResourceCounters[i] > 0)
				ProcResourceCounters[i]--;

				// Clear CriticalResourceIdx if it is now below the threshold.
				if (CriticalResourceIdx != UINT_MAX &&
				(ProcResourceCounters[CriticalResourceIdx] <=
				ProcResCostLim))
				CriticalResourceIdx = UINT_MAX;
				}

				DEBUG(if (DbgOutput)
				dumpProcResourceCounters(););
				}

				#ifndef NDEBUG // Debug output
				void SystemZHazardRecognizer::dumpSU(SUnit *SU, raw_ostream &OS) const {
				OS << "SU(" << SU->NodeNum << "):";
				OS << SchedModel->getInstrInfo()->getName(SU->getInstr()->getOpcode());

				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				if (!SC->isValid())
				return;

				for (TargetSchedModel::ProcResIter
				PI = SchedModel->getWriteProcResBegin(SC),
				PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) {
				const MCProcResourceDesc &PRD =
				*SchedModel->getProcResource(PI->ProcResourceIdx);
				std::string FU(PRD.Name);
				// trim e.g. Z13_FXaUnit -> FXa
				FU = FU.substr(FU.find("_") + 1);
				FU.resize(FU.find("Unit"));
				OS << "/" << FU;

				if (PI->Cycles > 1)
				OS << "(" << PI->Cycles << "cyc)";
				}

				if (SC->NumMicroOps > 1)
				OS << "/" << SC->NumMicroOps << "uops";
				if (SC->BeginGroup && SC->EndGroup)
				OS << "/GroupsAlone";
				else if (SC->BeginGroup)
				OS << "/BeginsGroup";
				else if (SC->EndGroup)
				uweigand: This looks a bit ad-hoc ... isn't there a more generic way to find the shorter name?
				jonpa: Fixed by using string::substr()/resize() instead. Now all units should really be named per a Z13_XXXUnit pattern.
				OS << "/EndsGroup";
				if (SU->isUnbuffered)
				OS << "/Unbuffered";
				}

				void SystemZHazardRecognizer::dumpCurrGroup(std::string Msg) const {
				dbgs() << "+++ " << Msg;
				dbgs() << ": ";

				if (CurGroupDbg.empty())
				dbgs() << " <empty>\n";
				else {
				dbgs() << "{ " << CurGroupDbg << " }";
				dbgs() << " (" << CurrGroupSize << " decoder slot"
				<< (CurrGroupSize > 1 ? "s":"")
				<< ")\n";
				}
				}

				void SystemZHazardRecognizer::dumpProcResourceCounters() const {
				bool any = false;

				for (unsigned i = 0; i < SchedModel->getNumProcResourceKinds(); ++i)
				if (ProcResourceCounters[i] > 0) {
				any = true;
				break;
				}

				if (!any)
				return;

				dbgs() << "+++ Resource counters:\n";
				for (unsigned i = 0; i < SchedModel->getNumProcResourceKinds(); ++i)
				if (ProcResourceCounters[i] > 0) {
				dbgs() << "+++ Extra schedule for execution unit "
				<< SchedModel->getProcResource(i)->Name
				<< ": " << ProcResourceCounters[i] << "\n";
				any = true;
				}
				}
				#endif //NDEBUG

				void SystemZHazardRecognizer::clearProcResCounters() {
				ProcResourceCounters.assign(SchedModel->getNumProcResourceKinds(), 0);
				CriticalResourceIdx = UINT_MAX;
				}

				// Update state with MI as next instruction. If SU is null, this
				// is during the advancing past instructions not subject to scheduling.
				void SystemZHazardRecognizer::
				EmitInstruction(SUnit *SU) {
				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				DEBUG( dumpCurrGroup("Decode group before emission"););

				// If scheduling an MI that must begin a new decoder group, move on
				// to next group.
				if (!fitsIntoCurrentGroup(SU))
				nextGroup();

				DEBUG( dbgs() << "+++ HazardRecognizer emitting "; dumpSU(SU, dbgs());
				dbgs() << "\n";
				raw_string_ostream cgd(CurGroupDbg);
				if (CurGroupDbg.length())
				cgd << ", ";
				dumpSU(SU, cgd););

				// After returning from a call, we don't know much about the state.
				if (SU->getInstr()->isCall()) {
				DEBUG (dbgs() << "+++ Clearing state after call.\n";);
				clearProcResCounters();
				LastFPdOpCycleIdx = UINT_MAX;
				CurrGroupSize += getNumDecoderSlots(SU);
				assert (CurrGroupSize <= 3);
				nextGroup();
				return;
				}

				// Increase counter for execution unit(s).
				for (TargetSchedModel::ProcResIter
				PI = SchedModel->getWriteProcResBegin(SC),
				PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) {
				// Don't handle FPd together with the other resources.
				if (SchedModel->getProcResource(PI->ProcResourceIdx)->BufferSize == 1)
				continue;
				int &CurrCounter =
				ProcResourceCounters[PI->ProcResourceIdx];
				CurrCounter += PI->Cycles;
				// Check if this is now the new critical resource.
				if ((CurrCounter > ProcResCostLim) &&
				(CriticalResourceIdx == UINT_MAX ||
				(PI->ProcResourceIdx != CriticalResourceIdx &&
				CurrCounter >
				ProcResourceCounters[CriticalResourceIdx]))) {
				DEBUG( dbgs() << "+++ New critical resource: "
				<< SchedModel->getProcResource(PI->ProcResourceIdx)->Name
				<< "\n";);
				CriticalResourceIdx = PI->ProcResourceIdx;
				}
				}

				// Make note of an instruction that uses a blocking resource (FPd).
				if (SU->isUnbuffered) {
				LastFPdOpCycleIdx = getCurrCycleIdx();
				DEBUG (dbgs() << "+++ Last FPd cycle index: "
				<< LastFPdOpCycleIdx << "\n";);
				}

				// Insert SU into current group by increasing number of slots used
				// in current group.
				CurrGroupSize += getNumDecoderSlots(SU);
				assert (CurrGroupSize <= 3);

				// Check if current group is now full/ended. If so, move on to next
				// group to be ready to evaluate more candidates.
				if (CurrGroupSize == 3 || SC->EndGroup)
				nextGroup();
				}

				int SystemZHazardRecognizer::groupingCost(SUnit *SU) const {
				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				if (!SC->isValid())
				return 0;

				// If SU begins new group, it can either break a current group early
				// or fit naturally if current group is empty (negative cost).
				if (SC->BeginGroup) {
				if (CurrGroupSize)
				return 3 - CurrGroupSize;
				return -1;
				}

				// Similarly, a group-ending SU may either fit well (last in group), or
				// end the group prematurely.
				if (SC->EndGroup) {
				unsigned resultingGroupSize =
				(CurrGroupSize + getNumDecoderSlots(SU));
				if (resultingGroupSize < 3)
				return (3 - resultingGroupSize);
				return -1;
				}

				// Most instructions can be placed in any decoder slot.
				return 0;
				}

				bool SystemZHazardRecognizer::isFPdOpPreferred_distance(const SUnit *SU) {
				assert (SU->isUnbuffered);
				// If this is the first FPd op, it should be scheduled high.
				if (LastFPdOpCycleIdx == UINT_MAX)
				return true;
				// If this is not the first FPd op, it should go into the other side
				// of the processor to use the other FPd unit there. This should
				// generally happen if two FPd ops are placed with 2 other
				// instructions between them (modulo 6).
				if (LastFPdOpCycleIdx > getCurrCycleIdx())
				return ((LastFPdOpCycleIdx - getCurrCycleIdx()) == 3);
				return ((getCurrCycleIdx() - LastFPdOpCycleIdx) == 3);
				}

				int SystemZHazardRecognizer::
				resourcesCost(SUnit *SU) {
				int Cost = 0;

				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				if (!SC->isValid())
				return 0;

				// For a FPd op, either return min or max value as indicated by the
				// distance to any prior FPd op.
				if (SU->isUnbuffered)
				Cost = (isFPdOpPreferred_distance(SU) ? INT_MIN : INT_MAX);
				// For other instructions, give a cost to the use of the critical resource.
				else if (CriticalResourceIdx != UINT_MAX) {
				for (TargetSchedModel::ProcResIter
				PI = SchedModel->getWriteProcResBegin(SC),
				PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI)
				if (PI->ProcResourceIdx == CriticalResourceIdx)
				Cost = PI->Cycles;
				}

				return Cost;
				}
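
The decoder-slot accounting in getNumDecoderSlots() and groupingCost() above can be tried out in isolation. The following is a minimal sketch, not the patch's code: ToySchedClass, numDecoderSlots and the free function groupingCost are hypothetical stand-ins that only model the BeginGroup/EndGroup flags and the three-slot group limit described in the file header.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the BeginGroup/EndGroup flags that
// the patch reads from MCSchedClassDesc. Not an LLVM type.
struct ToySchedClass {
  bool BeginGroup;
  bool EndGroup;
};

// Decoder slots taken by one instruction, following getNumDecoderSlots():
// cracked instructions take 2 slots, group-alone instructions take 3.
unsigned numDecoderSlots(const ToySchedClass &SC) {
  if (SC.BeginGroup)
    return SC.EndGroup ? 3 : 2;
  return 1; // Normal instruction.
}

// Cost of placing SC into a group that already holds CurrGroupSize slots,
// following groupingCost(). A negative value means the placement fits well;
// a positive value is the number of slots that would be wasted.
int groupingCost(const ToySchedClass &SC, unsigned CurrGroupSize) {
  assert(CurrGroupSize < 3 && "a full group is closed before evaluation");

  // A group-beginning instruction wastes whatever is left of an open group.
  if (SC.BeginGroup)
    return CurrGroupSize ? static_cast<int>(3 - CurrGroupSize) : -1;

  // A group-ending instruction wastes the slots that would follow it.
  if (SC.EndGroup) {
    unsigned Resulting = CurrGroupSize + numDecoderSlots(SC);
    return Resulting < 3 ? static_cast<int>(3 - Resulting) : -1;
  }

  return 0; // Most instructions can go into any slot.
}
```

With this model, a cracked instruction offered to a group that already holds one slot costs 2 (two slots are lost when the group is closed early), while the same instruction offered to an empty group costs -1.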

lib/Target/SystemZ/SystemZInstrFormats.td

Show First 20 Lines • Show All 2,775 Lines • ▼ Show 20 Lines	class AtomicLoadBinary<SDPatternOperator operator, RegisterOperand cls,
dag pat, DAGOperand operand>		dag pat, DAGOperand operand>
: Pseudo<(outs cls:$dst), (ins bdaddr20only:$ptr, operand:$src2),		: Pseudo<(outs cls:$dst), (ins bdaddr20only:$ptr, operand:$src2),
[(set cls:$dst, (operator bdaddr20only:$ptr, pat))]> {		[(set cls:$dst, (operator bdaddr20only:$ptr, pat))]> {
let Defs = [CC];		let Defs = [CC];
let Has20BitOffset = 1;		let Has20BitOffset = 1;
let mayLoad = 1;		let mayLoad = 1;
let mayStore = 1;		let mayStore = 1;
let usesCustomInserter = 1;		let usesCustomInserter = 1;
		let hasNoSchedulingInfo = 1;
}		}

// Specializations of AtomicLoadWBinary.		// Specializations of AtomicLoadWBinary.
class AtomicLoadBinaryReg32<SDPatternOperator operator>		class AtomicLoadBinaryReg32<SDPatternOperator operator>
: AtomicLoadBinary<operator, GR32, (i32 GR32:$src2), GR32>;		: AtomicLoadBinary<operator, GR32, (i32 GR32:$src2), GR32>;
class AtomicLoadBinaryImm32<SDPatternOperator operator, Immediate imm>		class AtomicLoadBinaryImm32<SDPatternOperator operator, Immediate imm>
: AtomicLoadBinary<operator, GR32, (i32 imm:$src2), imm>;		: AtomicLoadBinary<operator, GR32, (i32 imm:$src2), imm>;
class AtomicLoadBinaryReg64<SDPatternOperator operator>		class AtomicLoadBinaryReg64<SDPatternOperator operator>
Show All 10 Lines	: Pseudo<(outs GR32:$dst),
ADDR32:$negbitshift, uimm32:$bitsize),		ADDR32:$negbitshift, uimm32:$bitsize),
[(set GR32:$dst, (operator bdaddr20only:$ptr, pat, ADDR32:$bitshift,		[(set GR32:$dst, (operator bdaddr20only:$ptr, pat, ADDR32:$bitshift,
ADDR32:$negbitshift, uimm32:$bitsize))]> {		ADDR32:$negbitshift, uimm32:$bitsize))]> {
let Defs = [CC];		let Defs = [CC];
let Has20BitOffset = 1;		let Has20BitOffset = 1;
let mayLoad = 1;		let mayLoad = 1;
let mayStore = 1;		let mayStore = 1;
let usesCustomInserter = 1;		let usesCustomInserter = 1;
		let hasNoSchedulingInfo = 1;
}		}

// Specializations of AtomicLoadWBinary.		// Specializations of AtomicLoadWBinary.
class AtomicLoadWBinaryReg<SDPatternOperator operator>		class AtomicLoadWBinaryReg<SDPatternOperator operator>
: AtomicLoadWBinary<operator, (i32 GR32:$src2), GR32>;		: AtomicLoadWBinary<operator, (i32 GR32:$src2), GR32>;
class AtomicLoadWBinaryImm<SDPatternOperator operator, Immediate imm>		class AtomicLoadWBinaryImm<SDPatternOperator operator, Immediate imm>
: AtomicLoadWBinary<operator, (i32 imm:$src2), imm>;		: AtomicLoadWBinary<operator, (i32 imm:$src2), imm>;

// Define an instruction that operates on two fixed-length blocks of memory,		// Define an instruction that operates on two fixed-length blocks of memory,
// and associated pseudo instructions for operating on blocks of any size.		// and associated pseudo instructions for operating on blocks of any size.
// The Sequence form uses a straight-line sequence of instructions and		// The Sequence form uses a straight-line sequence of instructions and
// the Loop form uses a loop of length-256 instructions followed by		// the Loop form uses a loop of length-256 instructions followed by
// another instruction to handle the excess.		// another instruction to handle the excess.
multiclass MemorySS<string mnemonic, bits<8> opcode,		multiclass MemorySS<string mnemonic, bits<8> opcode,
SDPatternOperator sequence, SDPatternOperator loop> {		SDPatternOperator sequence, SDPatternOperator loop> {
def "" : InstSS<opcode, (outs), (ins bdladdr12onlylen8:$BDL1,		def "" : InstSS<opcode, (outs), (ins bdladdr12onlylen8:$BDL1,
bdaddr12only:$BD2),		bdaddr12only:$BD2),
mnemonic##"\t$BDL1, $BD2", []>;		mnemonic##"\t$BDL1, $BD2", []>;
let usesCustomInserter = 1 in {		let usesCustomInserter = 1, hasNoSchedulingInfo = 1 in {
def Sequence : Pseudo<(outs), (ins bdaddr12only:$dest, bdaddr12only:$src,		def Sequence : Pseudo<(outs), (ins bdaddr12only:$dest, bdaddr12only:$src,
imm64:$length),		imm64:$length),
[(sequence bdaddr12only:$dest, bdaddr12only:$src,		[(sequence bdaddr12only:$dest, bdaddr12only:$src,
imm64:$length)]>;		imm64:$length)]>;
def Loop : Pseudo<(outs), (ins bdaddr12only:$dest, bdaddr12only:$src,		def Loop : Pseudo<(outs), (ins bdaddr12only:$dest, bdaddr12only:$src,
imm64:$length, GR64:$count256),		imm64:$length, GR64:$count256),
[(loop bdaddr12only:$dest, bdaddr12only:$src,		[(loop bdaddr12only:$dest, bdaddr12only:$src,
imm64:$length, GR64:$count256)]>;		imm64:$length, GR64:$count256)]>;
Show All 9 Lines	multiclass StringRRE<string mnemonic, bits<16> opcode,
SDPatternOperator operator> {		SDPatternOperator operator> {
def "" : InstRRE<opcode, (outs GR64:$R1, GR64:$R2),		def "" : InstRRE<opcode, (outs GR64:$R1, GR64:$R2),
(ins GR64:$R1src, GR64:$R2src),		(ins GR64:$R1src, GR64:$R2src),
mnemonic#"\t$R1, $R2", []> {		mnemonic#"\t$R1, $R2", []> {
let Uses = [R0L];		let Uses = [R0L];
let Constraints = "$R1 = $R1src, $R2 = $R2src";		let Constraints = "$R1 = $R1src, $R2 = $R2src";
let DisableEncoding = "$R1src, $R2src";		let DisableEncoding = "$R1src, $R2src";
}		}
let usesCustomInserter = 1 in		let usesCustomInserter = 1, hasNoSchedulingInfo = 1 in
def Loop : Pseudo<(outs GR64:$end),		def Loop : Pseudo<(outs GR64:$end),
(ins GR64:$start1, GR64:$start2, GR32:$char),		(ins GR64:$start1, GR64:$start2, GR32:$char),
[(set GR64:$end, (operator GR64:$start1, GR64:$start2,		[(set GR64:$end, (operator GR64:$start1, GR64:$start2,
GR32:$char))]>;		GR32:$char))]>;
}		}

// A pseudo instruction that is a direct alias of a real instruction.		// A pseudo instruction that is a direct alias of a real instruction.
// These aliases are used in cases where a particular register operand is		// These aliases are used in cases where a particular register operand is

lib/Target/SystemZ/SystemZInstrInfo.h

Show First 20 Lines • Show All 162 Lines • ▼ Show 20 Lines	public:
bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,		bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,
MachineBasicBlock *&FBB,		MachineBasicBlock *&FBB,
SmallVectorImpl<MachineOperand> &Cond,		SmallVectorImpl<MachineOperand> &Cond,
bool AllowModify) const override;		bool AllowModify) const override;
unsigned removeBranch(MachineBasicBlock &MBB,		unsigned removeBranch(MachineBasicBlock &MBB,
int *BytesRemoved = nullptr) const override;		int *BytesRemoved = nullptr) const override;
unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,		unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,		MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
const DebugLoc &DL,		const DebugLoc &DL,
int *BytesAdded = nullptr) const override;		int *BytesAdded = nullptr) const override;
bool analyzeCompare(const MachineInstr &MI, unsigned &SrcReg,		bool analyzeCompare(const MachineInstr &MI, unsigned &SrcReg,
		atrick: FYI, the "new" machine model is meant to be flexible enough that you don't need to create your own hazard recognizer (you can add predicates and arbitrary pseudo machine resources). However, it's tricky to do that and fine just to use a hazard recognizer when you have complicated decode/issue group constraints.
unsigned &SrcReg2, int &Mask, int &Value) const override;		unsigned &SrcReg2, int &Mask, int &Value) const override;
bool optimizeCompareInstr(MachineInstr &CmpInstr, unsigned SrcReg,		bool optimizeCompareInstr(MachineInstr &CmpInstr, unsigned SrcReg,
unsigned SrcReg2, int Mask, int Value,		unsigned SrcReg2, int Mask, int Value,
const MachineRegisterInfo *MRI) const override;		const MachineRegisterInfo *MRI) const override;
bool isPredicable(MachineInstr &MI) const override;		bool isPredicable(MachineInstr &MI) const override;
bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,		bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,
unsigned ExtraPredCycles,		unsigned ExtraPredCycles,
BranchProbability Probability) const override;		BranchProbability Probability) const override;
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	unsigned getFusedCompare(unsigned Opcode,
SystemZII::FusedCompareType Type,		SystemZII::FusedCompareType Type,
const MachineInstr *MI = nullptr) const;		const MachineInstr *MI = nullptr) const;

// Emit code before MBBI in MI to move immediate value Value into		// Emit code before MBBI in MI to move immediate value Value into
// physical register Reg.		// physical register Reg.
void loadImmediate(MachineBasicBlock &MBB,		void loadImmediate(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
unsigned Reg, uint64_t Value) const;		unsigned Reg, uint64_t Value) const;

		// Sometimes, it is possible for the target to tell, even without
		// aliasing information, that two MIs access different memory
		// addresses. This function returns true if two MIs access different
		// memory addresses and false otherwise.
		bool
		areMemAccessesTriviallyDisjoint(MachineInstr &MIa, MachineInstr &MIb,
		AliasAnalysis *AA = nullptr) const override;
};		};
} // end namespace llvm		} // end namespace llvm

#endif		#endif

lib/Target/SystemZ/SystemZInstrInfo.cpp

Show First 20 Lines • Show All 1,510 Lines • ▼ Show 20 Lines	else if (SystemZ::isImmLH(Value)) {
Opcode = SystemZ::LLILH;		Opcode = SystemZ::LLILH;
Value >>= 16;		Value >>= 16;
} else {		} else {
assert(isInt<32>(Value) && "Huge values not handled yet");		assert(isInt<32>(Value) && "Huge values not handled yet");
Opcode = SystemZ::LGFI;		Opcode = SystemZ::LGFI;
}		}
BuildMI(MBB, MBBI, DL, get(Opcode), Reg).addImm(Value);		BuildMI(MBB, MBBI, DL, get(Opcode), Reg).addImm(Value);
}		}

		bool SystemZInstrInfo::
		areMemAccessesTriviallyDisjoint(MachineInstr &MIa, MachineInstr &MIb,
		AliasAnalysis *AA) const {

		if (!MIa.hasOneMemOperand() || !MIb.hasOneMemOperand())
		return false;

		// If mem-operands show that the same address Value is used by both
		// instructions, check for non-overlapping offsets and widths. Not
		// sure if a register based analysis would be an improvement...

		MachineMemOperand *MMOa = *MIa.memoperands_begin();
		MachineMemOperand *MMOb = *MIb.memoperands_begin();
		const Value *VALa = MMOa->getValue();
		const Value *VALb = MMOb->getValue();
		bool SameVal = (VALa && VALb && (VALa == VALb));
		if (!SameVal) {
		const PseudoSourceValue *PSVa = MMOa->getPseudoValue();
		const PseudoSourceValue *PSVb = MMOb->getPseudoValue();
		if (PSVa && PSVb && (PSVa == PSVb))
		SameVal = true;
		}
		if (SameVal) {
		int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
		int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
		int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
		int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
		int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
		if (LowOffset + LowWidth <= HighOffset)
		return true;
		}

		return false;
		}
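
The offset/width comparison at the end of areMemAccessesTriviallyDisjoint() is an interval-overlap test on two byte ranges that share a base address. A minimal standalone sketch of just that arithmetic follows. The function name rangesDisjoint and the use of int64_t are assumptions made for illustration; the patch itself works on MachineMemOperand values and stores the offsets in int.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when the byte ranges [OffsetA, OffsetA + WidthA) and
// [OffsetB, OffsetB + WidthB) cannot overlap. Both ranges are assumed to be
// relative to the same base address, which is what the SameVal check in the
// patch establishes before comparing offsets.
bool rangesDisjoint(int64_t OffsetA, int64_t WidthA,
                    int64_t OffsetB, int64_t WidthB) {
  assert(WidthA >= 0 && WidthB >= 0 && "widths are byte counts");

  // Order the two accesses by start offset.
  int64_t LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
  int64_t HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
  int64_t LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;

  // Disjoint if the lower access ends at or before the higher one begins.
  return LowOffset + LowWidth <= HighOffset;
}
```

For example, a 4-byte access at offset 0 and a 4-byte access at offset 4 are disjoint, whereas an 8-byte access at offset 0 overlaps a 4-byte access at offset 4. Two accesses at the same offset are never reported as disjoint by this check.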

lib/Target/SystemZ/SystemZInstrInfo.td

//===-- SystemZInstrInfo.td - General SystemZ instructions ----- tblgen--===//		//===-- SystemZInstrInfo.td - General SystemZ instructions ----- tblgen--===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Stack allocation		// Stack allocation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		let hasNoSchedulingInfo = 1 in {
def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i64imm:$amt),		def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i64imm:$amt),
[(callseq_start timm:$amt)]>;		[(callseq_start timm:$amt)]>;
def ADJCALLSTACKUP : Pseudo<(outs), (ins i64imm:$amt1, i64imm:$amt2),		def ADJCALLSTACKUP : Pseudo<(outs), (ins i64imm:$amt1, i64imm:$amt2),
[(callseq_end timm:$amt1, timm:$amt2)]>;		[(callseq_end timm:$amt1, timm:$amt2)]>;
		}

let hasSideEffects = 0 in {		let hasSideEffects = 0 in {
// Takes as input the value of the stack pointer after a dynamic allocation		// Takes as input the value of the stack pointer after a dynamic allocation
// has been made. Sets the output to the address of the dynamically-		// has been made. Sets the output to the address of the dynamically-
// allocated area itself, skipping the outgoing arguments.		// allocated area itself, skipping the outgoing arguments.
//		//
// This expands to an LA or LAY instruction. We restrict the offset		// This expands to an LA or LAY instruction. We restrict the offset
// to the range of LA and keep the LAY range in reserve for when		// to the range of LA and keep the LAY range in reserve for when
▲ Show 20 Lines • Show All 1,379 Lines • ▼ Show 20 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// A serialization instruction that acts as a barrier for all memory		// A serialization instruction that acts as a barrier for all memory
// accesses, which expands to "bcr 14, 0".		// accesses, which expands to "bcr 14, 0".
let hasSideEffects = 1 in		let hasSideEffects = 1 in
def Serialize : Alias<2, (outs), (ins), [(z_serialize)]>;		def Serialize : Alias<2, (outs), (ins), [(z_serialize)]>;

// A pseudo instruction that serves as a compiler barrier.		// A pseudo instruction that serves as a compiler barrier.
let hasSideEffects = 1 in		let hasSideEffects = 1, hasNoSchedulingInfo = 1 in
def MemBarrier : Pseudo<(outs), (ins), [(z_membarrier)]>;		def MemBarrier : Pseudo<(outs), (ins), [(z_membarrier)]>;

let Predicates = [FeatureInterlockedAccess1], Defs = [CC] in {		let Predicates = [FeatureInterlockedAccess1], Defs = [CC] in {
def LAA : LoadAndOpRSY<"laa", 0xEBF8, atomic_load_add_32, GR32>;		def LAA : LoadAndOpRSY<"laa", 0xEBF8, atomic_load_add_32, GR32>;
def LAAG : LoadAndOpRSY<"laag", 0xEBE8, atomic_load_add_64, GR64>;		def LAAG : LoadAndOpRSY<"laag", 0xEBE8, atomic_load_add_64, GR64>;
def LAAL : LoadAndOpRSY<"laal", 0xEBFA, null_frag, GR32>;		def LAAL : LoadAndOpRSY<"laal", 0xEBFA, null_frag, GR32>;
def LAALG : LoadAndOpRSY<"laalg", 0xEBEA, null_frag, GR64>;		def LAALG : LoadAndOpRSY<"laalg", 0xEBEA, null_frag, GR64>;
def LAN : LoadAndOpRSY<"lan", 0xEBF4, atomic_load_and_32, GR32>;		def LAN : LoadAndOpRSY<"lan", 0xEBF4, atomic_load_and_32, GR32>;
▲ Show 20 Lines • Show All 119 Lines • ▼ Show 20 Lines	: Pseudo<(outs GR32:$dst), (ins bdaddr20only:$addr, GR32:$cmp, GR32:$swap,
[(set GR32:$dst,		[(set GR32:$dst,
(z_atomic_cmp_swapw bdaddr20only:$addr, GR32:$cmp, GR32:$swap,		(z_atomic_cmp_swapw bdaddr20only:$addr, GR32:$cmp, GR32:$swap,
ADDR32:$bitshift, ADDR32:$negbitshift,		ADDR32:$bitshift, ADDR32:$negbitshift,
uimm32:$bitsize))]> {		uimm32:$bitsize))]> {
let Defs = [CC];		let Defs = [CC];
let mayLoad = 1;		let mayLoad = 1;
let mayStore = 1;		let mayStore = 1;
let usesCustomInserter = 1;		let usesCustomInserter = 1;
		let hasNoSchedulingInfo = 1;
}		}

let Defs = [CC] in {		let Defs = [CC] in {
defm CS : CmpSwapRSPair<"cs", 0xBA, 0xEB14, atomic_cmp_swap_32, GR32>;		defm CS : CmpSwapRSPair<"cs", 0xBA, 0xEB14, atomic_cmp_swap_32, GR32>;
def CSG : CmpSwapRSY<"csg", 0xEB30, atomic_cmp_swap_64, GR64>;		def CSG : CmpSwapRSY<"csg", 0xEB30, atomic_cmp_swap_64, GR64>;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

lib/Target/SystemZ/SystemZMachineScheduler.h

This file was added.

				//==-- SystemZMachineScheduler.h - SystemZ Scheduler Interface -- C++ ----==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// -------------------------- Post RA scheduling ---------------------------- //
				// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
				// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
				// implementation that looks to optimize decoder grouping and balance the
				// usage of processor resources.
				//===----------------------------------------------------------------------===//

				#include "SystemZInstrInfo.h"
				#include "SystemZHazardRecognizer.h"
				#include "llvm/CodeGen/MachineScheduler.h"
				#include "llvm/Support/Debug.h"

				#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H
				#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H

				using namespace llvm;

				namespace llvm {

				/// A MachineSchedStrategy implementation for SystemZ post RA scheduling.
				class SystemZPostRASchedStrategy : public MachineSchedStrategy {
				ScheduleDAGMI *DAG;

				/// A candidate during instruction evaluation.
				struct Candidate {
				SUnit *SU;

				/// The decoding cost.
				int GroupingCost;

				/// The processor resources cost.
				int ResourcesCost;

				Candidate() : SU(nullptr), GroupingCost(0), ResourcesCost(0) {}
				Candidate(SUnit *SU_, SystemZHazardRecognizer &HazardRec);

				// Compare two candidates.
				bool operator<(const Candidate &other);

				// Check if this node is free of cost ("as good as any").
				bool inline noCost() {
				return (GroupingCost <= 0 && !ResourcesCost);
				}
				};

				// A sorter for the Available set that makes sure that SUs are considered
				// in the best order.
				struct SUSorter {
				bool operator() (SUnit *lhs, SUnit *rhs) const {
				uweigand: As discussed offline, we really need to get rid of the global/static variable here. I note that there is quite a bit of similarity between this "preliminary" sorter and the final sorter in Candidate. Maybe we should actually store "preliminary" candidates in the Available list (with GroupingCost and ResourceCost only set to 0 or 1 depending on whether grouping or reserved resources are involved), and then update the cost parameters with actual values once we know them? We then might even be able to reuse the same comparison routine ...
				jonpa: I gave this a try, but I thought it was a bit messy to flip the sorting variables back and forth inside a set of Candidates. I instead use the isScheduleHigh flag for groupers / FPd ops, which simplifies the SUSorter method while also eliminating the static variable. The iteration in pickNode() should be nearly unaffected, since these nodes are quite rare.
				uweigand: OK, that makes sense.
				if (lhs->isScheduleHigh && !rhs->isScheduleHigh)
				return true;
				if (!lhs->isScheduleHigh && rhs->isScheduleHigh)
				return false;

				if (lhs->getHeight() > rhs->getHeight())
				return true;
				else if (lhs->getHeight() < rhs->getHeight())
				return false;

				return (lhs->NodeNum < rhs->NodeNum);
				}
				};
				// A set of SUs with a sorter and dump method.
				struct SUSet : std::set<SUnit*, SUSorter> {
				#ifndef NDEBUG
				void dump(SystemZHazardRecognizer &HazardRec);
				#endif
				};

				/// The set of available SUs to schedule next.
				SUSet Available;

				// HazardRecognizer that tracks the scheduler state for the current
				// region.
				SystemZHazardRecognizer HazardRec;

				public:
				SystemZPostRASchedStrategy(const MachineSchedContext *C);

				/// PostRA scheduling does not track pressure.
				bool shouldTrackPressure() const override { return false; }

				/// Initialize the strategy after building the DAG for a new region.
				void initialize(ScheduleDAGMI *dag) override;

				/// Pick the next node to schedule, or return NULL.
				SUnit *pickNode(bool &IsTopNode) override;

				/// ScheduleDAGMI has scheduled an instruction - tell HazardRec
				/// about it.
				void schedNode(SUnit *SU, bool IsTopNode) override;

				/// SU has had all predecessor dependencies resolved. Put it into
				/// Available.
				void releaseTopNode(SUnit *SU) override;

				/// Currently only scheduling top-down, so this method is empty.
				void releaseBottomNode(SUnit *SU) override {};
				};

				} // namespace llvm

				#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H */
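
The ordering that SUSorter imposes on the Available set (isScheduleHigh first, then greater height, then lower NodeNum as the tie-breaker) can be checked with a small standalone model. This is only a sketch under assumptions: ToySU, ToySorter, ToyAvailable and firstNodeNum are hypothetical names, and ToySU carries just the three fields the comparison reads instead of being an llvm::SUnit.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the SUnit fields that SUSorter reads.
struct ToySU {
  unsigned NodeNum;
  unsigned Height;
  bool isScheduleHigh;
};

// Same three-level ordering as SUSorter. Breaking ties on NodeNum makes this
// a strict weak ordering in which distinct nodes never compare equivalent,
// so std::set keeps every node.
struct ToySorter {
  bool operator()(const ToySU *L, const ToySU *R) const {
    if (L->isScheduleHigh != R->isScheduleHigh)
      return L->isScheduleHigh;
    if (L->Height != R->Height)
      return L->Height > R->Height;
    return L->NodeNum < R->NodeNum;
  }
};

using ToyAvailable = std::set<const ToySU *, ToySorter>;

// NodeNum of the node that an in-order walk of the set would visit first.
unsigned firstNodeNum(const ToyAvailable &Available) {
  assert(!Available.empty() && "nothing available to schedule");
  return (*Available.begin())->NodeNum;
}
```

Because the set is kept sorted on insertion, a loop such as the one in pickNode() sees schedule-high nodes before all others without any extra sorting step.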

lib/Target/SystemZ/SystemZMachineScheduler.cpp

This file was added.

				//-- SystemZMachineScheduler.cpp - SystemZ Scheduler Interface -- C++ ----==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// -------------------------- Post RA scheduling ---------------------------- //
				// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
				// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
				// implementation that looks to optimize decoder grouping and balance the
				// usage of processor resources.
				//===----------------------------------------------------------------------===//

				#include "SystemZMachineScheduler.h"

				using namespace llvm;

				#define DEBUG_TYPE "misched"

				#ifndef NDEBUG
				// Print the set of SUs
				void SystemZPostRASchedStrategy::SUSet::
				dump(SystemZHazardRecognizer &HazardRec) {
				dbgs() << "{";
				for (auto &SU : *this) {
				HazardRec.dumpSU(SU, dbgs());
				if (SU != *rbegin())
				dbgs() << ", ";
				}
				dbgs() << "}\n";
				}
				#endif

				SystemZPostRASchedStrategy::
				SystemZPostRASchedStrategy(const MachineSchedContext *C)
				: DAG(nullptr), HazardRec(C) {}

				void SystemZPostRASchedStrategy::initialize(ScheduleDAGMI *dag) {
				DAG = dag;
				HazardRec.setDAG(dag);
				HazardRec.Reset();
				}

				// Pick the next node to schedule.
				SUnit *SystemZPostRASchedStrategy::pickNode(bool &IsTopNode) {
				// Only scheduling top-down.
				IsTopNode = true;

				if (Available.empty())
				return nullptr;

				// If only one choice, return it.
				if (Available.size() == 1) {
				DEBUG (dbgs() << "+++ Only one: ";
				HazardRec.dumpSU(*Available.begin(), dbgs()); dbgs() << "\n";);
				return *Available.begin();
				}

				// All nodes that are possible to schedule are stored in the
				// Available set.
				DEBUG(dbgs() << "+++ Available: "; Available.dump(HazardRec););

				Candidate Best;
				for (auto *SU : Available) {

				// SU is the next candidate to be compared against current Best.
				Candidate c(SU, HazardRec);

				// Remember which SU is the best candidate.
				if (Best.SU == nullptr || c < Best) {
				Best = c;
				DEBUG(dbgs() << "+++ Best sofar: ";
				HazardRec.dumpSU(Best.SU, dbgs());
				if (Best.GroupingCost != 0)
				dbgs() << "\tGrouping cost:" << Best.GroupingCost;
				if (Best.ResourcesCost != 0)
				dbgs() << " Resource cost:" << Best.ResourcesCost;
				dbgs() << " Height:" << Best.SU->getHeight();
				dbgs() << "\n";);
				}

				// Once we know we have seen all SUs that affect grouping or use unbuffered
				// resources, we can stop iterating if Best looks good.
				if (!SU->isScheduleHigh && Best.noCost())
				break;
				}

				assert (Best.SU != nullptr);
				return Best.SU;
				}

				SystemZPostRASchedStrategy::Candidate::
				Candidate(SUnit *SU_, SystemZHazardRecognizer &HazardRec) : Candidate() {
				SU = SU_;

				// Check the grouping cost. For a node that must begin / end a
				// group, it is positive if it would do so prematurely, or negative
				// if it would fit naturally into the schedule.
				GroupingCost = HazardRec.groupingCost(SU);

				// Check the resources cost for this SU.
				ResourcesCost = HazardRec.resourcesCost(SU);
				}

				bool SystemZPostRASchedStrategy::Candidate::
				operator<(const Candidate &other) {

				// Check decoder grouping.
				if (GroupingCost < other.GroupingCost)
				return true;
				if (GroupingCost > other.GroupingCost)
				return false;

				// Compare the use of resources.
				if (ResourcesCost < other.ResourcesCost)
				return true;
				if (ResourcesCost > other.ResourcesCost)
				return false;

				// Higher SU is otherwise generally better.
				if (SU->getHeight() > other.SU->getHeight())
				return true;
				if (SU->getHeight() < other.SU->getHeight())
				return false;

				// If all same, fall back to original order.
				if (SU->NodeNum < other.SU->NodeNum)
				return true;

				return false;
				}

				void SystemZPostRASchedStrategy::schedNode(SUnit *SU, bool IsTopNode) {
				DEBUG(dbgs() << "+++ Scheduling SU(" << SU->NodeNum << ")\n";);

				// Remove SU from Available set and update HazardRec.
				Available.erase(SU);
				HazardRec.EmitInstruction(SU);
				}

				void SystemZPostRASchedStrategy::releaseTopNode(SUnit *SU) {
				// Set isScheduleHigh flag on all SUs that we want to consider first in
				// pickNode().
				const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
				bool AffectsGrouping = (SC->isValid() && (SC->BeginGroup || SC->EndGroup));
				SU->isScheduleHigh = (AffectsGrouping || SU->isUnbuffered);

				// Put all released SUs in the Available set.
				Available.insert(SU);
				}
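
The ordering implemented by Candidate::operator< above is a lexicographic comparison: lower grouping cost first, then lower resource cost, then greater height, then lower node number. A minimal standalone sketch of that ordering follows. ToyCandidate and isBetter are hypothetical names used only for illustration; they are not part of the patch and do not use the real Candidate or SUnit types.

```cpp
#include <tuple>

// Hypothetical stand-in for the patch's Candidate. The fields mirror the
// values that Candidate::operator< compares.
struct ToyCandidate {
  int GroupingCost;
  int ResourcesCost;
  unsigned Height;
  unsigned NodeNum;
};

// Returns true if A should be preferred over B.
inline bool isBetter(const ToyCandidate &A, const ToyCandidate &B) {
  // Height is the only field where a greater value wins, so its operands
  // are swapped between the two tuples.
  return std::tie(A.GroupingCost, A.ResourcesCost, B.Height, A.NodeNum) <
         std::tie(B.GroupingCost, B.ResourcesCost, A.Height, B.NodeNum);
}
```

With equal costs, a candidate of height 5 is preferred over one of height 3, and a candidate with a negative grouping cost is preferred over both regardless of height.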

lib/Target/SystemZ/SystemZProcessors.td

	def FeatureVector : SystemZFeature<			def FeatureVector : SystemZFeature<
	"vector", "Vector",			"vector", "Vector",
	"Assume that the vectory facility is installed"			"Assume that the vectory facility is installed"
	>;			>;
	def FeatureNoVector : SystemZMissingFeature<"Vector">;			def FeatureNoVector : SystemZMissingFeature<"Vector">;

	def : Processor<"generic", NoItineraries, []>;			def : Processor<"generic", NoItineraries, []>;
	def : Processor<"z10", NoItineraries, []>;			def : Processor<"z10", NoItineraries, []>;
	def : Processor<"z196", NoItineraries,			def : ProcessorModel<"z196", Z196Model,
	[FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,			[FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
	FeatureFPExtension, FeaturePopulationCount,			FeatureFPExtension, FeaturePopulationCount,
	FeatureFastSerialization, FeatureInterlockedAccess1]>;			FeatureFastSerialization, FeatureInterlockedAccess1]>;
	def : Processor<"zEC12", NoItineraries,			def : ProcessorModel<"zEC12", ZEC12Model,
	[FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,			[FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
	FeatureFPExtension, FeaturePopulationCount,			FeatureFPExtension, FeaturePopulationCount,
	FeatureFastSerialization, FeatureInterlockedAccess1,			FeatureFastSerialization, FeatureInterlockedAccess1,
	FeatureMiscellaneousExtensions,			FeatureMiscellaneousExtensions,
	FeatureTransactionalExecution, FeatureProcessorAssist]>;			FeatureTransactionalExecution, FeatureProcessorAssist]>;
	def : Processor<"z13", NoItineraries,			def : ProcessorModel<"z13", Z13Model,
	[FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,			[FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
	FeatureFPExtension, FeaturePopulationCount,			FeatureFPExtension, FeaturePopulationCount,
	FeatureFastSerialization, FeatureInterlockedAccess1,			FeatureFastSerialization, FeatureInterlockedAccess1,
	FeatureMiscellaneousExtensions,			FeatureMiscellaneousExtensions,
	FeatureTransactionalExecution, FeatureProcessorAssist,			FeatureTransactionalExecution, FeatureProcessorAssist,
	FeatureVector, FeatureLoadStoreOnCond2]>;			FeatureVector, FeatureLoadStoreOnCond2]>;

lib/Target/SystemZ/SystemZSchedule.td

This file was added.

				//==-- SystemZSchedule.td - SystemZ Scheduling Definitions ----- tblgen --==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				// Scheduler resources

				// These three resources are used to express decoder grouping rules.
				// The number of decoder slots needed by an instruction is normally
				// one. For a cracked instruction (BeginGroup && !EndGroup) it is
				// two. Expanded instructions (BeginGroup && EndGroup) group alone.
				def GroupAlone : SchedWrite;
				def BeginGroup : SchedWrite;
				def EndGroup : SchedWrite;

				// Latencies, to make code a bit neater. If more than one resource is
				// used for an instruction, the greatest latency (not the sum) will be
				// output by Tablegen. Therefore, in such cases one of these resources
				// is needed.
				def Lat2 : SchedWrite;
				def Lat3 : SchedWrite;
				def Lat4 : SchedWrite;
				def Lat5 : SchedWrite;
				def Lat6 : SchedWrite;
				def Lat7 : SchedWrite;
				def Lat8 : SchedWrite;
				def Lat9 : SchedWrite;
				def Lat10 : SchedWrite;
				def Lat11 : SchedWrite;
				def Lat12 : SchedWrite;
				def Lat15 : SchedWrite;
				def Lat20 : SchedWrite;
				def Lat30 : SchedWrite;

				// Fixed-point
				def FXa : SchedWrite;
				def FXb : SchedWrite;
				def FXU : SchedWrite;

				// Load/store unit
				def LSU : SchedWrite;

				// Model a return without latency, otherwise if-converter will model
				// extra cost and abort (currently there is an assert that checks that
				// all instructions have at least one uop).
				def LSU_lat1 : SchedWrite;

				// Floating point unit (zEC12 and earlier)
				def FPU : SchedWrite;

				// Vector sub units (z13)
				def VecBF : SchedWrite;
				def VecDF : SchedWrite;
				def VecFPd : SchedWrite; // Blocking BFP div/sqrt unit.
				def VecMul : SchedWrite;
				def VecStr : SchedWrite;
				def VecXsPm : SchedWrite;

				// Virtual branching unit
				def VBU : SchedWrite;


				include "SystemZScheduleZ13.td"
				include "SystemZScheduleZEC12.td"
				include "SystemZScheduleZ196.td"
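
The two comments at the top of SystemZSchedule.td (decoder grouping rules, and "greatest latency, not the sum") can be illustrated with a small standalone sketch. The names combinedLatency, DecodeKind and classify are hypothetical and exist only for this illustration; this is not how TableGen or the hazard recognizer is implemented.

```cpp
#include <algorithm>
#include <initializer_list>

// Latency of a sched class built from several writes: the greatest latency
// among them, not the sum, as the comment in SystemZSchedule.td states.
inline unsigned combinedLatency(std::initializer_list<unsigned> Latencies) {
  unsigned Max = 0;
  for (unsigned L : Latencies)
    Max = std::max(Max, L);
  return Max;
}

// Decoder grouping categories as described in the comment on the
// GroupAlone / BeginGroup / EndGroup writes.
enum class DecodeKind { Normal, Cracked, Expanded };

inline DecodeKind classify(bool BeginGroup, bool EndGroup) {
  if (BeginGroup && EndGroup)
    return DecodeKind::Expanded; // groups alone
  if (BeginGroup)
    return DecodeKind::Cracked; // needs two decoder slots
  // Includes instructions that only end a group; they still use one slot.
  return DecodeKind::Normal;
}
```

For example, with the z13 values from this patch (FXa latency 1, LSU latency 4), a class listed as [FXa, LSU, Lat5] gets latency 5, whereas [FXa, LSU] without the extra Lat5 write would get 4. This is why the LatN helper writes are needed.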

lib/Target/SystemZ/SystemZScheduleZ13.td

This file was added.

				//-- SystemZScheduleZ13.td - SystemZ Scheduling Definitions ----- tblgen --=//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the machine model for Z13 to support instruction
				// scheduling and other instruction cost heuristics.
				//
				//===----------------------------------------------------------------------===//

				def Z13Model : SchedMachineModel {

				let IssueWidth = 6; // 2 * 3 instructions decoded per cycle.
				let MicroOpBufferSize = 60; // Issue queues
				let LoadLatency = 1; // Optimistic load latency.

				let PostRAScheduler = 1;

				// Extra cycles for a mispredicted branch.
				let MispredictPenalty = 8;
				}

				let SchedModel = Z13Model in {
				uweigand: I think this is the default setting, so we can just leave it out here.
				jonpa: aah, right.

				uweigand: We should really try to get this complete, so that instructions added in the future don't accidentally lack scheduling information. How difficult would it be to get there?
				jonpa: I added the hasNoSchedulingInfo flag on the appropriate instructions, and then set CompleteModel (the reason I did not do that before is that AFAICR, this flag then also demanded modeling of operand writes or something of that sort). Worth mentioning is that this triggers an error during build for any instruction missing scheduling input for all subtargets. In a debug build, TargetSchedule.cpp will then cause an abort during compilation if the subtarget does not have scheduling input for an emitted instruction.
				uweigand: Great, thanks! I do think we indeed want the error, even for older subtargets. Hopefully in the not-too-distant future we'll have completed the instruction set for the old processors anyway ...
				jonpa: I found that my previous statement on the compile-time checking for scheduling input was not actually correct; it does not really cover all instructions. Currently asserts trigger only if:
				All subtargets have omitted the instruction from the Schedule .td files. TableGen catches this, and only this, because it isn't clever enough to know whether a given subtarget (with a missing sched class for an instruction) actually supports that instruction or not.
				An instruction has a def operand and the sched class does not have a WriteLatency entry for it. This is a specialized assert (in computeOperandLatency()) which doesn't cover all instructions; I could for instance remove the InstRW for a compare and not get any assert triggering.
				So there really isn't any general assert that checks that, for a subtarget with a CompleteModel, all instructions actually emitted have a sane scheduling class. What happened when I removed the InstRW for an instruction was that it got its own (valid) sched class for the subtarget, but with just 0 values. I think we could catch the error of forgetting a subtarget/instruction sched annotation with an assert in the scheduler that demands at least one uop for any target instruction. This has at least worked well during my experiments previously. Should I add this just in the SystemZ backend? Or could it be part of the common code somewhere? (I am thinking this should be done both pre-ra and post-ra.) Or is there any reason not to demand this that I have missed?
				uweigand: I don't think the backend is the right place to do it. In the backend, you only see the instructions that the code being compiled happens to use; and when you do find an error there, there's not much you can do. The right place does seem to be TableGen. And in fact, I had interpreted the code in CodeGenSchedModels::checkCompleteness to do just that check. If this doesn't work as expected, it probably ought to be fixed there. But in any case this is a separate problem and shouldn't hold up this patch.
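
The "at least one uop" check discussed in this thread can be sketched as a scan over a table of resolved sched classes. This is only an illustration under stated assumptions: SchedEntry and findUnmodeled are hypothetical names, the table is assumed to have been flattened per subtarget beforehand, and instructions flagged hasNoSchedulingInfo are assumed to have been filtered out already. It is not the logic of CodeGenSchedModels::checkCompleteness.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened view of one subtarget's model: an opcode name
// paired with the NumMicroOps of the sched class it resolved to.
using SchedEntry = std::pair<std::string, unsigned>;

// Returns the opcodes whose sched class has zero micro-ops, which is the
// symptom jonpa describes for an instruction with a missing InstRW.
inline std::vector<std::string>
findUnmodeled(const std::vector<SchedEntry> &Entries) {
  std::vector<std::string> Missing;
  for (const SchedEntry &E : Entries)
    if (E.second == 0)
      Missing.push_back(E.first);
  return Missing;
}
```

Running such a scan over all opcodes at build time, rather than asserting on emitted instructions, matches uweigand's point that the backend only ever sees the instructions used by the code being compiled.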
				// These definitions could be put in a subtarget common include file,
				// but it seems the include system in Tablegen currently rejects
				// multiple includes of the same file.
				def : WriteRes<GroupAlone, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				let EndGroup = 1;
				}
				def : WriteRes<BeginGroup, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				}
				def : WriteRes<EndGroup, []> {
				let NumMicroOps = 0;
				let EndGroup = 1;
				}
				atrick: In-order scheduling with multiple functional units of the same type is somewhat broken in the generic scheduler. ReservedCycles is only tracking a worst-case resource availability across all units. It should really be a two-dimensional array. I know this has been fixed out-of-tree for at least one target but never pushed back. It would be great if you fixed that!
				jonpa: This unit is handled as a special case by the HazardRecognizer, I guess partly because I couldn't see that the code - as you say - did what I wanted. That would be interesting to fix... Could this out-of-tree target perhaps push its fix?
				def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
				atrick: It wasn't a complete/general fix. But yes, I'll encourage anyone I can to improve the in-tree code. Out-of-tree work usually leads to problems. On the other hand, making it easy to write custom, possibly out-of-tree schedulers was a major goal of MI scheduling.
				def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
				def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
				def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
				def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
				def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
				def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
				def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
				def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
				def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
				def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
				def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
				def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
				def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}
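
As a rough illustration of the two-dimensional bookkeeping atrick suggests above for ReservedCycles, the sketch below tracks availability per unit instance rather than one worst-case value per resource kind. UnitReservations and its methods are hypothetical names for this illustration only; this is not the generic scheduler's data structure nor the out-of-tree fix mentioned in the thread.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical tracker: NextFree[Kind][Unit] is the first cycle at which
// that unit instance is free again.
class UnitReservations {
  std::vector<std::vector<unsigned>> NextFree;

public:
  // UnitsPerKind[K] is the number of identical units of kind K
  // (assumed to be at least 1 for every kind).
  explicit UnitReservations(const std::vector<unsigned> &UnitsPerKind) {
    for (unsigned N : UnitsPerKind)
      NextFree.push_back(std::vector<unsigned>(N, 0u));
  }

  // Earliest cycle >= Cycle at which some unit of this kind is available.
  unsigned earliestAvailable(unsigned Kind, unsigned Cycle) const {
    const std::vector<unsigned> &Units = NextFree[Kind];
    return std::max(Cycle, *std::min_element(Units.begin(), Units.end()));
  }

  // Reserve the unit of this kind that frees up first, for Busy cycles
  // starting no earlier than Cycle. Returns the cycle the reservation starts.
  unsigned reserve(unsigned Kind, unsigned Cycle, unsigned Busy) {
    std::vector<unsigned> &Units = NextFree[Kind];
    auto It = std::min_element(Units.begin(), Units.end());
    unsigned Start = std::max(Cycle, *It);
    *It = Start + Busy;
    return Start;
  }
};
```

With two units of one kind and a 30-cycle blocking operation (the shape of the z13 VecFPd unit in this patch), a second operation can start one cycle after the first, and only the third has to wait until cycle 30. A single worst-case counter would delay the second operation as well.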

				// Execution units.
				def Z13_FXaUnit : ProcResource<2>;
				def Z13_FXbUnit : ProcResource<2>;
				def Z13_LSUnit : ProcResource<2>;
				def Z13_VecBFUnit : ProcResource<2>;
				def Z13_VecDFUnit : ProcResource<2>;
				def Z13_VecFPdUnit : ProcResource<2> { let BufferSize = 1; /* blocking */ }
				def Z13_VecMulUnit : ProcResource<2>;
				def Z13_VecStrUnit : ProcResource<2>;
				def Z13_VecXsPmUnit : ProcResource<2>;
				def Z13_VBUnit : ProcResource<1>;

				// Subtarget specific definitions of scheduling resources.
				def : WriteRes<FXa, [Z13_FXaUnit]> { let Latency = 1; }
				def : WriteRes<FXb, [Z13_FXbUnit]> { let Latency = 1; }
				def : WriteRes<LSU, [Z13_LSUnit]> { let Latency = 4; }
				def : WriteRes<VecBF, [Z13_VecBFUnit]> { let Latency = 8; }
				def : WriteRes<VecDF, [Z13_VecDFUnit]>;
				def : WriteRes<VecFPd, [Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit]>
				{ let Latency = 30; }
				def : WriteRes<VecMul, [Z13_VecMulUnit]> { let Latency = 5; }
				def : WriteRes<VecStr, [Z13_VecStrUnit]> { let Latency = 4; }
				def : WriteRes<VecXsPm, [Z13_VecXsPmUnit]> { let Latency = 3; }
				def : WriteRes<VBU, [Z13_VBUnit]>; // Virtual Branching Unit

				// -------------------------- INSTRUCTIONS ---------------------------------- //

				// InstRW constructs have been used in order to preserve the
				// readability of the InstrInfo files.

				// For each instruction, as matched by a regexp, provide a list of
				// resources that it needs. These will be combined into a SchedClass.

				//===----------------------------------------------------------------------===//
				// Stack allocation
				//===----------------------------------------------------------------------===//
				uweigand: Ideally, the ordering in these files should mostly correspond to the ordering in the original InstrInfo files, just to make them easier to find ...
				jonpa: I have reorganized the files completely to match the InstrInfo file sections.
				uweigand: Excellent, it is much more readable now!

				uweigand: Also, it is somewhat annoying that we need to list not just the basic instruction definitions, but all the various aliases as well. I'm wondering if there isn't some way to annotate the Alias definition in the main file with the opcode the alias will be resolved to in the end, so that can be used for scheduling purposes ...
				jonpa: Could not as of yet find any way to achieve this, but there might be some way...
				uweigand: Oh well, if there's no straightforward way, it's OK the way it is now ...
				def : InstRW<[FXa], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY

				//===----------------------------------------------------------------------===//
				// Control flow instructions
				//===----------------------------------------------------------------------===//

				// Return
				def : InstRW<[FXb, EndGroup], (instregex "Return$")>;
				def : InstRW<[FXb], (instregex "CondReturn$")>;

				// Compare and branch
				def : InstRW<[FXb], (instregex "(Asm.*)?C(I|R)J$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CG(I|R)J$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CL(I|R)J$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CLG(I|R)J$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CG(R|I)J$")>;
				def : InstRW<[FXb], (instregex "CLR$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CLIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CLGIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CGIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CGRB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CLGRB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CLR(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CLRB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "(Asm.*)?CRB(Call|Return)?$")>;

				// Branch
				def : InstRW<[FXb], (instregex "(Asm.*)?BR$")>;
				def : InstRW<[FXb], (instregex "(Asm)?BC(R)?$")>;
				def : InstRW<[VBU], (instregex "(Asm)?BRC(L)?$")>;
				def : InstRW<[FXa, EndGroup], (instregex "BRCT(G)?$")>;
				def : InstRW<[VBU], (instregex "(Asm.*)?JG$")>;
				def : InstRW<[VBU], (instregex "J$")>;
				// (Need to avoid conflict with "(Asm.*)?CG(I|R)J$")
				def : InstRW<[VBU], (instregex "Asm(EAlt|E|HAlt|HE|H|LAlt|LE|LH|L|NEAlt|NE)J$")>;
				def : InstRW<[VBU], (instregex "Asm(NHAlt|NHE|NH|NLAlt|NLE|NLH|NL|NO|O)J$")>;
				def : InstRW<[FXa, FXa, FXb, FXb, Lat4, GroupAlone], (instregex "BRX(H|LE)$")>;

				// Trap
				def : InstRW<[VBU], (instregex "(Cond)?Trap$")>;

				// Compare and trap
				def : InstRW<[FXb], (instregex "(Asm.*)?C(G)?IT$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?C(G)?RT$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CLG(I|R)T$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CLFIT$")>;
				def : InstRW<[FXb], (instregex "(Asm.*)?CLRT$")>;

				//===----------------------------------------------------------------------===//
				// Select instructions
				//===----------------------------------------------------------------------===//

				// Select pseudo
				def : InstRW<[FXa], (instregex "Select(32|64|32Mux)$")>;

				// CondStore pseudos
				def : InstRW<[FXa], (instregex "CondStore16(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore16Mux(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore32(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore64(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore8(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore8Mux(Inv)?$")>;

				//===----------------------------------------------------------------------===//
				// Call instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[VBU, FXa, FXa, Lat3, GroupAlone], (instregex "BRAS$")>;
				def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "(Call)?BASR$")>;
				def : InstRW<[FXb], (instregex "CallB(C)?R$")>;
				def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "(Call)?BRASL$")>;
				def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "TLS_(G|L)DCALL$")>;
				def : InstRW<[VBU], (instregex "CallBRCL$")>;
				def : InstRW<[VBU], (instregex "CallJG$")>;

				//===----------------------------------------------------------------------===//
				// Move instructions
				//===----------------------------------------------------------------------===//

				// Moves
				def : InstRW<[FXb, LSU, Lat5], (instregex "MV(G|H)?HI$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "MVI(Y)?$")>;

				// Move character
				def : InstRW<[FXb, LSU, LSU, LSU, Lat8, GroupAlone], (instregex "MVC$")>;

				// Pseudo -> reg move
				def : InstRW<[FXa], (instregex "COPY(_TO_REGCLASS)?$")>;
				def : InstRW<[FXa], (instregex "EXTRACT_SUBREG$")>;
				def : InstRW<[FXa], (instregex "INSERT_SUBREG$")>;
				def : InstRW<[FXa], (instregex "REG_SEQUENCE$")>;
				def : InstRW<[FXa], (instregex "SUBREG_TO_REG$")>;

				// Loads
				def : InstRW<[LSU], (instregex "L(Y|FH|RL|Mux|CBB)?$")>;
				def : InstRW<[LSU], (instregex "LG(RL)?$")>;
				def : InstRW<[LSU], (instregex "L128$")>;

				def : InstRW<[FXa], (instregex "LLIH(F|H|L)$")>;
				def : InstRW<[FXa], (instregex "LLIL(F|H|L)$")>;

				def : InstRW<[FXa], (instregex "LG(F|H)I$")>;
				def : InstRW<[FXa], (instregex "LHI(Mux)?$")>;
				def : InstRW<[FXa], (instregex "LR(Mux)?$")>;

				// Load and test
				def : InstRW<[FXa, LSU, Lat5], (instregex "LT(G)?$")>;
				def : InstRW<[FXa], (instregex "LT(G)?R$")>;

				// Load on condition
				def : InstRW<[FXa, LSU, Lat6], (instregex "(Asm.*)?LOC(G)?$")>;
				def : InstRW<[FXa, Lat2], (instregex "(Asm.*)?LOC(G)?R$")>;
				def : InstRW<[FXa, Lat2], (instregex "(Asm.*)?LOC(G)?HI$")>;

				// Stores
				def : InstRW<[FXb, LSU, Lat5], (instregex "STG(RL)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "ST128$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "ST(Y|FH|RL|Mux)?$")>;

				uweigand: This is not an "And", it's a non-transactional store and should go with the transaction-related instructions.
				// Store on condition
				def : InstRW<[FXb, LSU, Lat5], (instregex "(Asm.*)?STOC(G)?$")>;

				// String moves.
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;

				//===----------------------------------------------------------------------===//
				// Sign extensions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa], (instregex "L(B|H|G)R$")>;
				def : InstRW<[FXa], (instregex "LG(B|H|F)R$")>;

				def : InstRW<[FXa, LSU, Lat5], (instregex "LTGF$")>;
				def : InstRW<[FXa], (instregex "LTGFR$")>;

				def : InstRW<[FXa, LSU, Lat5], (instregex "LB(H|Mux)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LH(Y)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LH(H|Mux|RL)$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LG(B|H|F)$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LG(H|F)RL$")>;

				//===----------------------------------------------------------------------===//
				// Zero extensions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa], (instregex "LLCR(Mux)?$")>;
				def : InstRW<[FXa], (instregex "LLHR(Mux)?$")>;
				def : InstRW<[FXa], (instregex "LLG(C|H|F)R$")>;
				def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
				def : InstRW<[LSU], (instregex "LLH(Mux)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LL(C|H)H$")>;
				def : InstRW<[LSU], (instregex "LLHRL$")>;
				def : InstRW<[LSU], (instregex "LLG(C|H|F|HRL|FRL)$")>;

				//===----------------------------------------------------------------------===//
				// Truncations
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb, LSU, Lat5], (instregex "STC(H|Y|Mux)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "STH(H|Y|RL|Mux)?$")>;

				//===----------------------------------------------------------------------===//
				// Multi-register moves
				//===----------------------------------------------------------------------===//

				// Load multiple (estimated average of 5 ops)
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "LM(H|Y|G)?$")>;

				// Store multiple (estimated average of ceil(5/2) FXb ops)
				def : InstRW<[LSU, LSU, FXb, FXb, FXb, Lat10,
				GroupAlone], (instregex "STM(G|H|Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Byte swaps
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa], (instregex "LRV(G)?R$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LRV(G|H)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "STRV(G|H)?$")>;

				//===----------------------------------------------------------------------===//
				// Load address instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa], (instregex "LA(Y|RL)?$")>;

				// Load the Global Offset Table address ( -> larl )
				def : InstRW<[FXa], (instregex "GOT$")>;

				//===----------------------------------------------------------------------===//
				// Absolute and Negation
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, Lat2], (instregex "LP(G)?R$")>;
				def : InstRW<[FXa, FXa, Lat3, BeginGroup], (instregex "L(N|P)GFR$")>;
				def : InstRW<[FXa, Lat2], (instregex "LN(R|GR)$")>;
				def : InstRW<[FXa], (instregex "LC(R|GR)$")>;
				def : InstRW<[FXa, FXa, Lat2, BeginGroup], (instregex "LCGFR$")>;

				//===----------------------------------------------------------------------===//
				// Insertion
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat5], (instregex "IC(Y)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "IC32(Y)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "ICM(H|Y)?$")>;
				def : InstRW<[FXa], (instregex "II(F|H|L)Mux$")>;
				def : InstRW<[FXa], (instregex "IIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "IIHH(64)?$")>;
				def : InstRW<[FXa], (instregex "IIHL(64)?$")>;
				def : InstRW<[FXa], (instregex "IILF(64)?$")>;
				def : InstRW<[FXa], (instregex "IILH(64)?$")>;
				def : InstRW<[FXa], (instregex "IILL(64)?$")>;

				//===----------------------------------------------------------------------===//
				// Addition
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat5], (instregex "A(Y)?$")>;
				def : InstRW<[FXa, LSU, Lat6], (instregex "AH(Y)?$")>;
				def : InstRW<[FXa], (instregex "AIH$")>;
				def : InstRW<[FXa], (instregex "AFI(Mux)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "AG$")>;
				def : InstRW<[FXa], (instregex "AGFI$")>;
				def : InstRW<[FXa], (instregex "AGHI(K)?$")>;
				def : InstRW<[FXa], (instregex "AGR(K)?$")>;
				def : InstRW<[FXa], (instregex "AHI(K)?$")>;
				def : InstRW<[FXa], (instregex "AHIMux(K)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "AL(Y)?$")>;
				def : InstRW<[FXa], (instregex "AL(FI|HSIK)$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "ALG(F)?$")>;
				def : InstRW<[FXa], (instregex "ALGHSIK$")>;
				def : InstRW<[FXa], (instregex "ALGF(I|R)$")>;
				def : InstRW<[FXa], (instregex "ALGR(K)?$")>;
				def : InstRW<[FXa], (instregex "ALR(K)?$")>;
				def : InstRW<[FXa], (instregex "AR(K)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "A(G)?SI$")>;

				// Logical addition with carry
				def : InstRW<[FXa, LSU, Lat6, GroupAlone], (instregex "ALC(G)?$")>;
				def : InstRW<[FXa, Lat2, GroupAlone], (instregex "ALC(G)?R$")>;

				// Add with sign extension (32 -> 64)
				def : InstRW<[FXa, LSU, Lat6], (instregex "AGF$")>;
				def : InstRW<[FXa, Lat2], (instregex "AGFR$")>;

				//===----------------------------------------------------------------------===//
				// Subtraction
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat5], (instregex "S(G\|Y)?$")>;
				def : InstRW<[FXa, LSU, Lat6], (instregex "SH(Y)?$")>;
				def : InstRW<[FXa], (instregex "SGR(K)?$")>;
				def : InstRW<[FXa], (instregex "SLFI$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "SL(G\|GF\|Y)?$")>;
				def : InstRW<[FXa], (instregex "SLGF(I\|R)$")>;
				def : InstRW<[FXa], (instregex "SLGR(K)?$")>;
				def : InstRW<[FXa], (instregex "SLR(K)?$")>;
				def : InstRW<[FXa], (instregex "SR(K)?$")>;

				// Subtraction with borrow
				def : InstRW<[FXa, LSU, Lat6, GroupAlone], (instregex "SLB(G)?$")>;
				def : InstRW<[FXa, Lat2, GroupAlone], (instregex "SLB(G)?R$")>;

				// Subtraction with sign extension (32 -> 64)
				def : InstRW<[FXa, LSU, Lat6], (instregex "SGF$")>;
				def : InstRW<[FXa, Lat2], (instregex "SGFR$")>;

				//===----------------------------------------------------------------------===//
				// AND
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat5], (instregex "N(G\|Y)?$")>;
				def : InstRW<[FXa], (instregex "NGR(K)?$")>;
				def : InstRW<[FXa], (instregex "NI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "NI(Y)?$")>;
				def : InstRW<[FXa], (instregex "NIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "NIHH(64)?$")>;
				def : InstRW<[FXa], (instregex "NIHL(64)?$")>;
				def : InstRW<[FXa], (instregex "NILF(64)?$")>;
				def : InstRW<[FXa], (instregex "NILH(64)?$")>;
				def : InstRW<[FXa], (instregex "NILL(64)?$")>;
				def : InstRW<[FXa], (instregex "NR(K)?$")>;
				def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "NC$")>;

				//===----------------------------------------------------------------------===//
				// OR
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat5], (instregex "O(G\|Y)?$")>;
				def : InstRW<[FXa], (instregex "OGR(K)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "OI(Y)?$")>;
				def : InstRW<[FXa], (instregex "OI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXa], (instregex "OIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "OIHH(64)?$")>;
				def : InstRW<[FXa], (instregex "OIHL(64)?$")>;
				def : InstRW<[FXa], (instregex "OILF(64)?$")>;
				def : InstRW<[FXa], (instregex "OILH(64)?$")>;
				def : InstRW<[FXa], (instregex "OILL(64)?$")>;
				def : InstRW<[FXa], (instregex "OR(K)?$")>;
				def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "OC$")>;

				//===----------------------------------------------------------------------===//
				// XOR
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat5], (instregex "X(G\|Y)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "XI(Y)?$")>;
				def : InstRW<[FXa], (instregex "XIFMux$")>;
				def : InstRW<[FXa], (instregex "XGR(K)?$")>;
				def : InstRW<[FXa], (instregex "XIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "XILF(64)?$")>;
				def : InstRW<[FXa], (instregex "XR(K)?$")>;
				def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "XC$")>;

				//===----------------------------------------------------------------------===//
				// Multiplication
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, LSU, Lat10], (instregex "MS(GF\|Y)?$")>;
				def : InstRW<[FXa, Lat6], (instregex "MS(R\|FI)$")>;
				def : InstRW<[FXa, LSU, Lat12], (instregex "MSG$")>;
				def : InstRW<[FXa, Lat8], (instregex "MSGR$")>;
				def : InstRW<[FXa, Lat6], (instregex "MSGF(I\|R)$")>;
				def : InstRW<[FXa, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
				def : InstRW<[FXa, Lat9, GroupAlone], (instregex "MLGR$")>;
				def : InstRW<[FXa, Lat5], (instregex "MGHI$")>;
				def : InstRW<[FXa, Lat5], (instregex "MHI$")>;
				def : InstRW<[FXa, LSU, Lat9], (instregex "MH(Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Division and remainder
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa, Lat30, GroupAlone], (instregex "DSG(F)?R$")>;
				def : InstRW<[LSU, FXa, Lat30, GroupAlone], (instregex "DSG(F)?$")>;
				def : InstRW<[FXa, FXa, Lat20, GroupAlone], (instregex "DLR$")>;
				def : InstRW<[FXa, FXa, Lat30, GroupAlone], (instregex "DLGR$")>;
				def : InstRW<[FXa, FXa, LSU, Lat30, GroupAlone], (instregex "DL(G)?$")>;

				//===----------------------------------------------------------------------===//
				// Shifts
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa], (instregex "SLL(G\|K)?$")>;
				def : InstRW<[FXa], (instregex "SRL(G\|K)?$")>;
				def : InstRW<[FXa], (instregex "SRA(G\|K)?$")>;
				def : InstRW<[FXa], (instregex "SLA(K)?$")>;

				// Rotate
				def : InstRW<[FXa, LSU, Lat6], (instregex "RLL(G)?$")>;

				// Rotate and insert
				def : InstRW<[FXa], (instregex "RISBG(N\|32)?$")>;
				def : InstRW<[FXa], (instregex "RISBH(G\|H\|L)$")>;
				def : InstRW<[FXa], (instregex "RISBL(G\|H\|L)$")>;
				def : InstRW<[FXa], (instregex "RISBMux$")>;

				// Rotate and Select
				def : InstRW<[FXa, FXa, Lat3, BeginGroup], (instregex "R(N\|O\|X)SBG$")>;

				//===----------------------------------------------------------------------===//
				// Comparison
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb, LSU, Lat5], (instregex "C(G\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXb], (instregex "CFI(Mux)?$")>;
				def : InstRW<[FXb], (instregex "CG(F\|H)I$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CG(HSI\|RL)$")>;
				def : InstRW<[FXb], (instregex "C(G)?R$")>;
				def : InstRW<[FXb], (instregex "C(HI\|IH)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CH(F\|SI)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CL(Y\|Mux\|FHSI)?$")>;
				def : InstRW<[FXb], (instregex "CLFI(Mux)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLG(HRL\|HSI)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLGF(RL)?$")>;
				def : InstRW<[FXb], (instregex "CLGF(I\|R)$")>;
				def : InstRW<[FXb], (instregex "CLGR$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLGRL$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLH(F\|RL\|HSI)$")>;
				def : InstRW<[FXb], (instregex "CLIH$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLI(Y)?$")>;
				def : InstRW<[FXb], (instregex "CLR$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLRL$")>;

				// Compare halfword
				def : InstRW<[FXb, LSU, Lat6], (instregex "CH(Y\|RL)?$")>;
				def : InstRW<[FXb, LSU, Lat6], (instregex "CGH(RL)?$")>;
				def : InstRW<[FXa, FXb, LSU, Lat6, BeginGroup], (instregex "CHHSI$")>;

				// Compare with sign extension (32 -> 64)
				def : InstRW<[FXb, LSU, Lat6], (instregex "CGF(RL)?$")>;
				def : InstRW<[FXb, Lat2], (instregex "CGFR$")>;

				// Compare logical character
				def : InstRW<[FXb, LSU, LSU, Lat9, BeginGroup], (instregex "CLC$")>;

				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;

				// Test under mask
				def : InstRW<[FXb, LSU, Lat5], (instregex "TM(Y)?$")>;
				def : InstRW<[FXb], (instregex "TM(H\|L)Mux$")>;
				def : InstRW<[FXb], (instregex "TMHH(64)?$")>;
				def : InstRW<[FXb], (instregex "TMHL(64)?$")>;
				def : InstRW<[FXb], (instregex "TMLH(64)?$")>;
				def : InstRW<[FXb], (instregex "TMLL(64)?$")>;

				//===----------------------------------------------------------------------===//
				// Prefetch
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU], (instregex "PFD(RL)?$")>;

				//===----------------------------------------------------------------------===//
				// Atomic operations
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb, EndGroup], (instregex "Serialize$")>;

				def : InstRW<[FXb, LSU, Lat5], (instregex "LAA(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAAL(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAN(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAO(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAX(G)?$")>;

				// Compare and swap
				def : InstRW<[FXa, FXb, LSU, Lat6, GroupAlone], (instregex "CS(G\|Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Transactional execution
				//===----------------------------------------------------------------------===//

				// Transaction begin
				def : InstRW<[LSU, LSU, FXb, FXb, FXb, FXb, FXb, Lat15, GroupAlone],
				(instregex "TBEGIN(C\|_nofloat)?$")>;

				// Transaction end
				def : InstRW<[FXb, GroupAlone], (instregex "TEND$")>;

				// Transaction abort
				def : InstRW<[LSU, GroupAlone], (instregex "TABORT$")>;

				// Extract Transaction Nesting Depth
				def : InstRW<[FXa], (instregex "ETND$")>;

				// Nontransactional store
				def : InstRW<[FXb, LSU, Lat5], (instregex "NTSTG$")>;

				//===----------------------------------------------------------------------===//
				// Processor assist
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb], (instregex "PPA$")>;

				//===----------------------------------------------------------------------===//
				// Miscellaneous Instructions.
				//===----------------------------------------------------------------------===//

				// Insert Program Mask
				def : InstRW<[FXa, Lat3, EndGroup], (instregex "IPM$")>;

				// Extract access register
				def : InstRW<[LSU], (instregex "EAR$")>;

				// Find leftmost one
				def : InstRW<[FXa, Lat6, GroupAlone], (instregex "FLOGR$")>;

				// Population count
				def : InstRW<[FXa, Lat3], (instregex "POPCNT$")>;

				// Extend
				def : InstRW<[FXa], (instregex "AEXT128_64$")>;
				def : InstRW<[FXa], (instregex "ZEXT128_(32\|64)$")>;

				// String instructions
uweigand: It would be nice to at least separate out vector floating-point instructions, so we can easily see where W variants are needed.
				def : InstRW<[FXa, LSU, Lat30], (instregex "SRST$")>;

				// Move with key
				def : InstRW<[FXa, FXa, FXb, LSU, Lat8, GroupAlone], (instregex "MVCK$")>;

				// Extract CPU Time
				def : InstRW<[FXa, Lat5, LSU], (instregex "ECTG$")>;

				// Execute
				def : InstRW<[FXb, GroupAlone], (instregex "EX(RL)?$")>;

				// Program return
				def : InstRW<[FXb, Lat30], (instregex "PR$")>;

				// Inline assembly
				def : InstRW<[LSU, LSU, LSU, FXa, FXa, FXb, Lat9, GroupAlone],
				(instregex "STCK(F)?$")>;
				def : InstRW<[LSU, LSU, LSU, LSU, FXa, FXa, FXb, FXb, Lat11, GroupAlone],
				(instregex "STCKE$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "STFLE$")>;
				def : InstRW<[FXb, Lat30], (instregex "SVC$")>;

				// Store real address
				def : InstRW<[FXb, LSU, Lat5], (instregex "STRAG$")>;

				//===----------------------------------------------------------------------===//
				// .insn directive instructions
				//===----------------------------------------------------------------------===//

				// An "empty" sched-class will be assigned instead of the "invalid sched-class".
				// getNumDecoderSlots() will then return 1 instead of 0.
				def : InstRW<[], (instregex "Insn.*")>;


				// ----------------------------- Floating point ----------------------------- //

				//===----------------------------------------------------------------------===//
				// FP: Select instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXa], (instregex "SelectF(32\|64\|128)$")>;
				def : InstRW<[FXa], (instregex "CondStoreF32(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStoreF64(Inv)?$")>;

				//===----------------------------------------------------------------------===//
				// FP: Move instructions
				//===----------------------------------------------------------------------===//

				// Load zero
				def : InstRW<[FXb], (instregex "LZ(DR\|ER)$")>;
				def : InstRW<[FXb, FXb, Lat2, BeginGroup], (instregex "LZXR$")>;

				// Load
				def : InstRW<[VecXsPm], (instregex "LER$")>;
				def : InstRW<[FXb], (instregex "LD(R\|R32\|GR)$")>;
				def : InstRW<[FXb, Lat3], (instregex "LGDR$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "LXR$")>;

				// Load and Test
				def : InstRW<[VecXsPm, Lat4], (instregex "LT(D\|E)BR$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "LTEBRCompare(_VecPseudo)?$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "LTDBRCompare(_VecPseudo)?$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "LTXBR$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone],
				(instregex "LTXBRCompare(_VecPseudo)?$")>;

				// Copy sign
				def : InstRW<[VecXsPm], (instregex "CPSDRd(d\|s)$")>;
				def : InstRW<[VecXsPm], (instregex "CPSDRs(d\|s)$")>;

				//===----------------------------------------------------------------------===//
				// FP: Load instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm, LSU, Lat7], (instregex "LE(Y)?$")>;
				def : InstRW<[LSU], (instregex "LD(Y\|E32)?$")>;
				def : InstRW<[LSU], (instregex "LX$")>;

				//===----------------------------------------------------------------------===//
				// FP: Store instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb, LSU, Lat7], (instregex "STD(Y)?$")>;
				def : InstRW<[FXb, LSU, Lat7], (instregex "STE(Y)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "STX$")>;

				//===----------------------------------------------------------------------===//
				// FP: Conversion instructions
				//===----------------------------------------------------------------------===//

				// Load rounded
				def : InstRW<[VecBF], (instregex "LEDBR(A)?$")>;
				def : InstRW<[VecDF, VecDF, Lat20], (instregex "LEXBR(A)?$")>;
				def : InstRW<[VecDF, VecDF, Lat20], (instregex "LDXBR(A)?$")>;

				// Load lengthened
				def : InstRW<[VecBF, LSU, Lat12], (instregex "LDEB$")>;
				def : InstRW<[VecBF], (instregex "LDEBR$")>;
				def : InstRW<[VecBF, VecBF, LSU, Lat12, GroupAlone], (instregex "LX(D\|E)B$")>;
				def : InstRW<[VecBF, VecBF, Lat9, GroupAlone], (instregex "LX(D\|E)BR$")>;

				// Convert from fixed / logical
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CE(F\|G)BR$")>;
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CD(F\|G)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat12, GroupAlone], (instregex "CX(F\|G)BR$")>;
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CEL(F\|G)BR$")>;
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CDL(F\|G)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat12, GroupAlone], (instregex "CXL(F\|G)BR$")>;

				// Convert to fixed / logical
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CF(E\|D)BR$")>;
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CG(E\|D)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "C(F\|G)XBR$")>;
				def : InstRW<[FXb, VecBF, Lat11, GroupAlone], (instregex "CLFEBR$")>;
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CLFDBR$")>;
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CLG(E\|D)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "CL(F\|G)XBR$")>;

				//===----------------------------------------------------------------------===//
				// FP: Unary arithmetic
				//===----------------------------------------------------------------------===//

				// Load Complement / Negative / Positive
				def : InstRW<[VecXsPm, Lat4], (instregex "L(C\|N\|P)DBR$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "L(C\|N\|P)EBR$")>;
				def : InstRW<[FXb], (instregex "LCDFR(_32)?$")>;
				def : InstRW<[FXb], (instregex "LNDFR(_32)?$")>;
				def : InstRW<[FXb], (instregex "LPDFR(_32)?$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "L(C\|N\|P)XBR$")>;

				// Square root
				def : InstRW<[VecFPd, LSU], (instregex "SQ(E\|D)B$")>;
				def : InstRW<[VecFPd], (instregex "SQ(E\|D)BR$")>;
				def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "SQXBR$")>;

				// Load FP integer
				def : InstRW<[VecBF], (instregex "FIEBR(A)?$")>;
				def : InstRW<[VecBF], (instregex "FIDBR(A)?$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "FIXBR(A)?$")>;

				//===----------------------------------------------------------------------===//
				// FP: Binary arithmetic
				//===----------------------------------------------------------------------===//

				// Addition
				def : InstRW<[VecBF, LSU, Lat12], (instregex "A(E\|D)B$")>;
				def : InstRW<[VecBF], (instregex "A(E\|D)BR$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "AXBR$")>;

				// Subtraction
				def : InstRW<[VecBF, LSU, Lat12], (instregex "S(E\|D)B$")>;
				def : InstRW<[VecBF], (instregex "S(E\|D)BR$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "SXBR$")>;

				// Multiply
				def : InstRW<[VecBF, LSU, Lat12], (instregex "M(D\|DE\|EE)B$")>;
				def : InstRW<[VecBF], (instregex "M(D\|DE\|EE)BR$")>;
uweigand: I don't think there's a real difference between those and the ones listed under Other.
				def : InstRW<[VecBF, VecBF, LSU, Lat12, GroupAlone], (instregex "MXDB$")>;
				def : InstRW<[VecBF, VecBF, Lat9, GroupAlone], (instregex "MXDBR$")>;
				def : InstRW<[VecDF, VecDF, Lat20, GroupAlone], (instregex "MXBR$")>;

				// Multiply and add / subtract
				def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A\|S)EB$")>;
				def : InstRW<[VecBF, GroupAlone], (instregex "M(A\|S)EBR$")>;
				def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A\|S)DB$")>;
				def : InstRW<[VecBF], (instregex "M(A\|S)DBR$")>;

				// Division
				def : InstRW<[VecFPd, LSU], (instregex "D(E\|D)B$")>;
				def : InstRW<[VecFPd], (instregex "D(E\|D)BR$")>;
				def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "DXBR$")>;

				//===----------------------------------------------------------------------===//
				// FP: Comparisons
				//===----------------------------------------------------------------------===//

				// Compare
				def : InstRW<[VecXsPm, LSU, Lat8], (instregex "C(E\|D)B$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "C(E\|D)BR?$")>;
				def : InstRW<[VecDF, VecDF, Lat20, GroupAlone], (instregex "CXBR$")>;

				// Test Data Class
uweigand: This is just an alias for a LARL and should go there.
				def : InstRW<[LSU, VecXsPm, Lat9], (instregex "TC(E\|D)B$")>;
				def : InstRW<[LSU, VecDF, VecDF, Lat15, GroupAlone], (instregex "TCXB$")>;


				// --------------------------------- Vector --------------------------------- //

				//===----------------------------------------------------------------------===//
				// Vector: Move instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb], (instregex "VLR(32\|64)?$")>;
				def : InstRW<[FXb, Lat4], (instregex "VLGV(B\|F\|G\|H)$")>;
				def : InstRW<[FXb], (instregex "VLVG(B\|F\|G\|H)$")>;
				def : InstRW<[FXb, Lat2], (instregex "VLVGP(32)?$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Immediate instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm], (instregex "VZERO$")>;
				def : InstRW<[VecXsPm], (instregex "VONE$")>;
				def : InstRW<[VecXsPm], (instregex "VGBM$")>;
				def : InstRW<[VecXsPm], (instregex "VGM(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VLEI(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VREPI(B\|F\|G\|H)$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Loads
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU], (instregex "VL(L\|BB)?$")>;
				def : InstRW<[LSU], (instregex "VL(32\|64)$")>;
				def : InstRW<[LSU], (instregex "VLLEZ(B\|F\|G\|H)$")>;
				def : InstRW<[LSU], (instregex "VLREP(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, LSU, Lat7], (instregex "VLE(B\|F\|G\|H)$")>;
				def : InstRW<[FXb, LSU, VecXsPm, Lat11, BeginGroup], (instregex "VGE(F\|G)$")>;
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "VLM$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Stores
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb, LSU, Lat8], (instregex "VST(L\|32\|64)?$")>;
				def : InstRW<[FXb, LSU, Lat8], (instregex "VSTE(F\|G)$")>;
				def : InstRW<[FXb, LSU, VecXsPm, Lat11, BeginGroup], (instregex "VSTE(B\|H)$")>;
				def : InstRW<[LSU, LSU, FXb, FXb, FXb, FXb, FXb, Lat20, GroupAlone],
				(instregex "VSTM$")>;
				def : InstRW<[FXb, FXb, LSU, Lat12, BeginGroup], (instregex "VSCE(F\|G)$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Selects and permutes
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm], (instregex "VMRH(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMRL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VPERM$")>;
				def : InstRW<[VecXsPm], (instregex "VPDI$")>;
				def : InstRW<[VecXsPm], (instregex "VREP(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VSEL$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Widening and narrowing
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm], (instregex "VPK(F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VPKS(F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VPKS(F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VPKLS(F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VPKLS(F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VSEG(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPH(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPL(B\|F)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPLH(B\|F\|H\|W)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPLL(B\|F\|H)$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Integer arithmetic
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm], (instregex "VA(B\|F\|G\|H\|Q\|CQ)$")>;
				def : InstRW<[VecXsPm], (instregex "VACC(B\|F\|G\|H\|Q\|CQ)$")>;
				def : InstRW<[VecXsPm], (instregex "VAVG(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VAVGL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VN(C\|O)?$")>;
				def : InstRW<[VecXsPm], (instregex "VO$")>;
				def : InstRW<[VecMul], (instregex "VCKSM$")>;
				def : InstRW<[VecXsPm], (instregex "VCLZ(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VCTZ(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VX$")>;
				def : InstRW<[VecMul], (instregex "VGFMA(B\|F\|G\|H)$")>;
				def : InstRW<[VecMul], (instregex "VGFM(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VLC(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VLP(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMX(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMXL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMN(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMNL(B\|F\|G\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAL(B\|F)$")>;
				def : InstRW<[VecMul], (instregex "VMALE(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMALH(B\|F\|H\|W)$")>;
				def : InstRW<[VecMul], (instregex "VMALO(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAO(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAE(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAH(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VME(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMH(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VML(B\|F)$")>;
				def : InstRW<[VecMul], (instregex "VMLE(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMLH(B\|F\|H\|W)$")>;
				def : InstRW<[VecMul], (instregex "VMLO(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMO(B\|F\|H)$")>;

				def : InstRW<[VecXsPm], (instregex "VPOPCT$")>;

				def : InstRW<[VecXsPm], (instregex "VERLL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VERLLV(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VERIM(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESLV(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRA(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRAV(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRLV(B\|F\|G\|H)$")>;

				def : InstRW<[VecXsPm], (instregex "VSL(DB)?$")>;
				def : InstRW<[VecXsPm, VecXsPm, Lat8], (instregex "VSLB$")>;
				def : InstRW<[VecXsPm], (instregex "VSR(A\|L)$")>;
				def : InstRW<[VecXsPm, VecXsPm, Lat8], (instregex "VSR(A\|L)B$")>;

				def : InstRW<[VecXsPm], (instregex "VSB(IQ\|CBIQ)?$")>;
				def : InstRW<[VecXsPm], (instregex "VSCBI(B\|F\|G\|H\|Q)$")>;
				def : InstRW<[VecXsPm], (instregex "VS(F\|G\|H\|Q)$")>;

				def : InstRW<[VecMul], (instregex "VSUM(B\|H)$")>;
				def : InstRW<[VecMul], (instregex "VSUMG(F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VSUMQ(F\|G)$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Integer comparison
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm, Lat4], (instregex "VEC(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VECL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VCEQ(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VCEQ(B\|F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VCH(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VCH(B\|F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VCHL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VCHL(B\|F\|G\|H)S$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VTM$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Floating-point arithmetic
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecBF], (instregex "VCD(GB\|LGB)$")>;
				def : InstRW<[VecBF], (instregex "WCD(GB\|LGB)$")>;
				def : InstRW<[VecBF], (instregex "(V\|W)FADB$")>;
				def : InstRW<[VecBF], (instregex "(V\|W)CGDB$")>;
				def : InstRW<[VecBF], (instregex "VF(I\|M\|S)DB$")>;
				def : InstRW<[VecBF], (instregex "WF(I\|M\|S)DB$")>;
				def : InstRW<[VecBF], (instregex "(V\|W)CLGDB$")>;
				def : InstRW<[VecXsPm], (instregex "VFL(C\|N\|P)DB$")>;
				def : InstRW<[VecXsPm], (instregex "WFL(C\|N\|P)DB$")>;
				def : InstRW<[VecBF], (instregex "VFM(A\|S)DB$")>;
				def : InstRW<[VecBF], (instregex "WFM(A\|S)DB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VFTCIDB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "WFTCIDB$")>;
				def : InstRW<[VecBF], (instregex "VL(DE\|ED)B$")>;
				def : InstRW<[VecBF], (instregex "WL(DE\|ED)B$")>;

				// divide / square root
				def : InstRW<[VecFPd], (instregex "(V\|W)FDDB$")>;
				def : InstRW<[VecFPd], (instregex "(V\|W)FSQDB$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Floating-point comparison
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecXsPm], (instregex "VFC(E\|H\|HE)DB$")>;
				def : InstRW<[VecXsPm], (instregex "WFC(E\|H\|HE)DB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VFC(E\|H\|HE)DBS$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "WFC(E\|H\|HE)DBS$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "WF(C\|K)DB$")>;

				//===----------------------------------------------------------------------===//
				// Vector: Floating-point insertion and extraction
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXb], (instregex "LEFR$")>;
				def : InstRW<[FXb, Lat4], (instregex "LFER$")>;

				//===----------------------------------------------------------------------===//
				// Vector: String instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[VecStr], (instregex "VFAEB$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFAEBS$")>;
				def : InstRW<[VecStr], (instregex "VFAE(F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFAE(F\|H)S$")>;
				def : InstRW<[VecStr], (instregex "VFAEZ(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFAEZ(B\|F\|H)S$")>;
				def : InstRW<[VecStr], (instregex "VFEE(B\|F\|H\|ZB\|ZF\|ZH)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFEE(B\|F\|H\|ZB\|ZF\|ZH)S$")>;
				def : InstRW<[VecStr], (instregex "VFENE(B\|F\|H\|ZB\|ZF\|ZH)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFENE(B\|F\|H\|ZB\|ZF\|ZH)S$")>;
				def : InstRW<[VecStr], (instregex "VISTR(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VISTR(B\|F\|H)S$")>;
				def : InstRW<[VecStr], (instregex "VSTRC(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VSTRC(B\|F\|H)S$")>;
				def : InstRW<[VecStr], (instregex "VSTRCZ(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VSTRCZ(B\|F\|H)S$")>;

				}

lib/Target/SystemZ/SystemZScheduleZ196.td

This file was added.

				//=- SystemZScheduleZ196.td - SystemZ Scheduling Definitions ---- tblgen --=//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the machine model for Z196 to support instruction
				// scheduling and other instruction cost heuristics.
				//
				//===----------------------------------------------------------------------===//

				def Z196Model : SchedMachineModel {

				let IssueWidth = 3; // 3 instructions decoded per cycle.
				let MicroOpBufferSize = 40; // Issue queues
				let LoadLatency = 1; // Optimistic load latency.

				let PostRAScheduler = 1;

				// Extra cycles for a mispredicted branch.
				let MispredictPenalty = 8;
				}

				let SchedModel = Z196Model in {

				// These definitions could be put in a subtarget common include file,
				// but it seems the include system in TableGen currently rejects
				// multiple includes of the same file.
				def : WriteRes<GroupAlone, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				let EndGroup = 1;
				}
				def : WriteRes<EndGroup, []> {
				let NumMicroOps = 0;
				let EndGroup = 1;
				}
				def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
				def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
				def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
				def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
				def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
				def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
				def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
				def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
				def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
				def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
				def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
				def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
				def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
				def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}

				// Execution units.
				def Z196_FXUnit : ProcResource<1>;
				def Z196_LSUnit : ProcResource<1>;
				def Z196_FPUnit : ProcResource<1>;

				// Subtarget specific definitions of scheduling resources.
				def : WriteRes<FXU, [Z196_FXUnit]> { let Latency = 1; }
				def : WriteRes<LSU, [Z196_LSUnit]> { let Latency = 4; }
				def : WriteRes<LSU_lat1, [Z196_LSUnit]> { let Latency = 1; }
				def : WriteRes<FPU, [Z196_FPUnit]> { let Latency = 8; }

				// -------------------------- INSTRUCTIONS ---------------------------------- //

				uweigand: What's the EC12 doing here?
				jonpa: Good heavens!
				// InstRW constructs have been used in order to preserve the
				// readability of the InstrInfo files.

				// For each instruction, as matched by a regexp, provide a list of
				// resources that it needs. These will be combined into a SchedClass.

				//===----------------------------------------------------------------------===//
				// Stack allocation
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY

				//===----------------------------------------------------------------------===//
				// Control flow instructions
				//===----------------------------------------------------------------------===//

				// Return
				def : InstRW<[LSU_lat1, EndGroup], (instregex "Return$")>;
				def : InstRW<[LSU_lat1, EndGroup], (instregex "CondReturn$")>;

				// Compare and branch
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?C(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CG(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CL(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLG(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "CLR$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLR(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CRB(Call\|Return)?$")>;

				// Branch
				def : InstRW<[LSU, EndGroup], (instregex "(Asm.*)?BR$")>;
				def : InstRW<[LSU, EndGroup], (instregex "(Asm)?BC(R)?$")>;
				def : InstRW<[LSU, EndGroup], (instregex "(Asm)?BRC(L)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BRCT(G)?$")>;
				def : InstRW<[LSU, EndGroup], (instregex "(Asm.*)?JG$")>;
				def : InstRW<[LSU, EndGroup], (instregex "J$")>;
				// (Need to avoid conflict with "(Asm.*)?CG(I\|R)J$")
				def : InstRW<[LSU, EndGroup], (instregex "Asm(EAlt\|E\|HAlt\|HE\|H\|LAlt\|LE\|LH\|L\|NEAlt\|NE)J$")>;
				def : InstRW<[LSU, EndGroup], (instregex "Asm(NHAlt\|NHE\|NH\|NLAlt\|NLE\|NLH\|NL\|NO\|O)J$")>;
				def : InstRW<[FXU, FXU, FXU, LSU, Lat7, GroupAlone], (instregex "BRX(H\|LE)$")>;

				// Trap
				def : InstRW<[LSU, EndGroup], (instregex "(Cond)?Trap$")>;

				// Compare and trap
				def : InstRW<[FXU], (instregex "(Asm.*)?C(G)?IT$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?C(G)?RT$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLG(I\|R)T$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLFIT$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLRT$")>;

				//===----------------------------------------------------------------------===//
				// Select instructions
				//===----------------------------------------------------------------------===//

				// Select pseudo
				def : InstRW<[FXU], (instregex "Select(32\|64\|32Mux)$")>;

				// CondStore pseudos
				def : InstRW<[FXU], (instregex "CondStore16(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore16Mux(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore64(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8Mux(Inv)?$")>;

				//===----------------------------------------------------------------------===//
				// Call instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "BRAS$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BASR$")>;
				def : InstRW<[LSU, EndGroup], (instregex "CallB(C)?R$")>;
				def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "(Call)?BRASL$")>;
				def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "TLS_(G\|L)DCALL$")>;
				def : InstRW<[LSU, EndGroup], (instregex "CallBRCL$")>;
				def : InstRW<[LSU, EndGroup], (instregex "CallJG$")>;

				//===----------------------------------------------------------------------===//
				// Move instructions
				//===----------------------------------------------------------------------===//

				// Moves
				def : InstRW<[FXU, LSU, Lat5], (instregex "MV(G\|H)?HI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "MVI(Y)?$")>;

				// Move character
				def : InstRW<[LSU, LSU, LSU, FXU, Lat8, GroupAlone], (instregex "MVC$")>;

				// Pseudo -> reg move
				def : InstRW<[FXU], (instregex "COPY(_TO_REGCLASS)?$")>;
				def : InstRW<[FXU], (instregex "EXTRACT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "INSERT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "REG_SEQUENCE$")>;
				def : InstRW<[FXU], (instregex "SUBREG_TO_REG$")>;

				// Loads
				def : InstRW<[LSU], (instregex "L(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[LSU], (instregex "LG(RL)?$")>;
				def : InstRW<[LSU], (instregex "L128$")>;

				def : InstRW<[FXU], (instregex "LLIH(F\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "LLIL(F\|H\|L)$")>;

				def : InstRW<[FXU], (instregex "LG(F\|H)I$")>;
				def : InstRW<[FXU], (instregex "LHI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LR(Mux)?$")>;

				// Load and test
				def : InstRW<[FXU, LSU, Lat5], (instregex "LT(G)?$")>;
				def : InstRW<[FXU], (instregex "LT(G)?R$")>;

				// Load on condition
				def : InstRW<[FXU, LSU, Lat6, EndGroup], (instregex "(Asm.*)?LOC(G)?$")>;
				def : InstRW<[FXU, Lat2, EndGroup], (instregex "(Asm.*)?LOC(G)?R$")>;

				// Stores
				def : InstRW<[FXU, LSU, Lat5], (instregex "STG(RL)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST128$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST(Y\|FH\|RL\|Mux)?$")>;

				// Store on condition
				def : InstRW<[FXU, LSU, Lat5, EndGroup], (instregex "(Asm.*)?STOC(G)?$")>;

				// String moves.
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;

				//===----------------------------------------------------------------------===//
				// Sign extensions
				//===----------------------------------------------------------------------===//
				def : InstRW<[FXU], (instregex "L(B\|H\|G)R$")>;
				def : InstRW<[FXU], (instregex "LG(B\|H\|F)R$")>;

				def : InstRW<[FXU, LSU, Lat5], (instregex "LTGF$")>;
				def : InstRW<[FXU], (instregex "LTGFR$")>;

				def : InstRW<[FXU, LSU, Lat5], (instregex "LB(H\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LH(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LH(H\|Mux\|RL)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(B\|H\|F)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(H\|F)RL$")>;

				//===----------------------------------------------------------------------===//
				// Zero extensions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "LLCR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLHR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLG(C\|F\|H)R$")>;
				def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
				def : InstRW<[LSU], (instregex "LLH(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LL(C\|H)H$")>;
				def : InstRW<[LSU], (instregex "LLHRL$")>;
				def : InstRW<[LSU], (instregex "LLG(C\|F\|H\|FRL\|HRL)$")>;

				//===----------------------------------------------------------------------===//
				// Truncations
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "STC(H\|Y\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STH(H\|Y\|RL\|Mux)?$")>;

				//===----------------------------------------------------------------------===//
				// Multi-register moves
				//===----------------------------------------------------------------------===//

				// Load multiple (estimated average of 5 ops)
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "LM(H\|Y\|G)?$")>;

				// Store multiple (estimated average of 3 ops)
				def : InstRW<[LSU, LSU, FXU, FXU, FXU, Lat10, GroupAlone],
				(instregex "STM(H\|Y\|G)?$")>;

				//===----------------------------------------------------------------------===//
				// Byte swaps
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "LRV(G)?R$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LRV(G\|H)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRV(G\|H)?$")>;

				//===----------------------------------------------------------------------===//
				// Load address instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "LA(Y\|RL)?$")>;

				// Load the Global Offset Table address
				def : InstRW<[FXU], (instregex "GOT$")>;

				//===----------------------------------------------------------------------===//
				// Absolute and Negation
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, Lat2], (instregex "LP(G)?R$")>;
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "L(N\|P)GFR$")>;
				def : InstRW<[FXU, Lat2], (instregex "LN(R\|GR)$")>;
				def : InstRW<[FXU], (instregex "LC(R\|GR)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LCGFR$")>;

				//===----------------------------------------------------------------------===//
				// Insertion
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "IC(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "IC32(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ICM(H\|Y)?$")>;
				def : InstRW<[FXU], (instregex "II(F\|H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "IIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "IILF(64)?$")>;
				def : InstRW<[FXU], (instregex "IILH(64)?$")>;
				def : InstRW<[FXU], (instregex "IILL(64)?$")>;

				//===----------------------------------------------------------------------===//
				// Addition
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "A(Y\|SI)?$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "AH(Y)?$")>;
				def : InstRW<[FXU], (instregex "AIH$")>;
				def : InstRW<[FXU], (instregex "AFI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "AGFI$")>;
				def : InstRW<[FXU], (instregex "AGHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AGR(K)?$")>;
				def : InstRW<[FXU], (instregex "AHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AHIMux(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AL(Y)?$")>;
				def : InstRW<[FXU], (instregex "AL(FI\|HSIK)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ALG(F)?$")>;
				def : InstRW<[FXU], (instregex "ALGHSIK$")>;
				def : InstRW<[FXU], (instregex "ALGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "ALGR(K)?$")>;
				def : InstRW<[FXU], (instregex "ALR(K)?$")>;
				def : InstRW<[FXU], (instregex "AR(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AG(SI)?$")>;

				// Logical addition with carry
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "ALC(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "ALC(G)?R$")>;

				// Add with sign extension (32 -> 64)
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "AGF$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "AGFR$")>;

				//===----------------------------------------------------------------------===//
				// Subtraction
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "S(G\|Y)?$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "SH(Y)?$")>;
				def : InstRW<[FXU], (instregex "SGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLFI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "SL(G\|GF\|Y)?$")>;
				def : InstRW<[FXU], (instregex "SLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "SLGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLR(K)?$")>;
				def : InstRW<[FXU], (instregex "SR(K)?$")>;

				// Subtraction with borrow
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "SLB(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "SLB(G)?R$")>;

				// Subtraction with sign extension (32 -> 64)
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "SGF$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "SGFR$")>;

				//===----------------------------------------------------------------------===//
				// AND
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "N(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "NGR(K)?$")>;
				def : InstRW<[FXU], (instregex "NI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "NI(Y)?$")>;
				def : InstRW<[FXU], (instregex "NIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "NILF(64)?$")>;
				def : InstRW<[FXU], (instregex "NILH(64)?$")>;
				def : InstRW<[FXU], (instregex "NILL(64)?$")>;
				def : InstRW<[FXU], (instregex "NR(K)?$")>;
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "NC$")>;

				//===----------------------------------------------------------------------===//
				// OR
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "O(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "OGR(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "OI(Y)?$")>;
				def : InstRW<[FXU], (instregex "OI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU], (instregex "OIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "OILF(64)?$")>;
				def : InstRW<[FXU], (instregex "OILH(64)?$")>;
				def : InstRW<[FXU], (instregex "OILL(64)?$")>;
				def : InstRW<[FXU], (instregex "OR(K)?$")>;
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "OC$")>;

				//===----------------------------------------------------------------------===//
				// XOR
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "X(G\|Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "XI(Y)?$")>;
				def : InstRW<[FXU], (instregex "XIFMux$")>;
				def : InstRW<[FXU], (instregex "XGR(K)?$")>;
				def : InstRW<[FXU], (instregex "XIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "XILF(64)?$")>;
				def : InstRW<[FXU], (instregex "XR(K)?$")>;
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "XC$")>;

				//===----------------------------------------------------------------------===//
				// Multiplication
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat10], (instregex "MS(GF\|Y)?$")>;
				def : InstRW<[FXU, Lat6], (instregex "MS(R\|FI)$")>;
				def : InstRW<[FXU, LSU, Lat12], (instregex "MSG$")>;
				def : InstRW<[FXU, Lat8], (instregex "MSGR$")>;
				def : InstRW<[FXU, Lat6], (instregex "MSGF(I\|R)$")>;
				def : InstRW<[FXU, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
				def : InstRW<[FXU, Lat9, GroupAlone], (instregex "MLGR$")>;
				def : InstRW<[FXU, Lat5], (instregex "MGHI$")>;
				def : InstRW<[FXU, Lat5], (instregex "MHI$")>;
				def : InstRW<[FXU, LSU, Lat9], (instregex "MH(Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Division and remainder
				//===----------------------------------------------------------------------===//

				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?$")>;
				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?$")>;

				//===----------------------------------------------------------------------===//
				// Shifts
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "SLL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRA(G\|K)?$")>;
				def : InstRW<[FXU, Lat2], (instregex "SLA(K)?$")>;

				// Rotate
				def : InstRW<[FXU, LSU, Lat6], (instregex "RLL(G)?$")>;

				// Rotate and insert
				def : InstRW<[FXU], (instregex "RISBG(32)?$")>;
				def : InstRW<[FXU], (instregex "RISBH(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBL(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBMux$")>;

				// Rotate and Select
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "R(N\|O\|X)SBG$")>;

				//===----------------------------------------------------------------------===//
				// Comparison
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "C(G\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXU], (instregex "CFI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "CG(F\|H)I$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CG(HSI\|RL)$")>;
				def : InstRW<[FXU], (instregex "C(G)?R$")>;
				def : InstRW<[FXU], (instregex "C(HI\|IH)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CH(F\|SI)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CL(Y\|Mux\|FHSI)?$")>;
				def : InstRW<[FXU], (instregex "CLFI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLG(HRL\|HSI)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGF(RL)?$")>;
				def : InstRW<[FXU], (instregex "CLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "CLGR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGRL$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLH(F\|RL\|HSI)$")>;
				def : InstRW<[FXU], (instregex "CLIH$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLI(Y)?$")>;
				def : InstRW<[FXU], (instregex "CLR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLRL$")>;

				// Compare halfword
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CH(Y\|RL)?$")>;
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CGH(RL)?$")>;
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CHHSI$")>;

				// Compare with sign extension (32 -> 64)
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CGF(RL)?$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "CGFR$")>;

				// Compare logical character
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "CLC$")>;

				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;

				// Test under mask
				def : InstRW<[FXU, LSU, Lat5], (instregex "TM(Y)?$")>;
				def : InstRW<[FXU], (instregex "TM(H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "TMHH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMHL(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLL(64)?$")>;

				//===----------------------------------------------------------------------===//
				// Prefetch
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU, GroupAlone], (instregex "PFD(RL)?$")>;

				//===----------------------------------------------------------------------===//
				// Atomic operations
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU, EndGroup], (instregex "Serialize$")>;

				def : InstRW<[FXU, LSU, Lat5], (instregex "LAA(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAAL(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAN(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAO(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAX(G)?$")>;

				// Compare and swap
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CS(G\|Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Miscellaneous Instructions.
				//===----------------------------------------------------------------------===//

				// Insert Program Mask
				def : InstRW<[FXU, Lat3, EndGroup], (instregex "IPM$")>;

				// Extract access register
				def : InstRW<[LSU], (instregex "EAR$")>;

				// Find leftmost one
				def : InstRW<[FXU, Lat7, GroupAlone], (instregex "FLOGR$")>;

				// Population count
				def : InstRW<[FXU, Lat3], (instregex "POPCNT$")>;

				// Extend
				def : InstRW<[FXU], (instregex "AEXT128_64$")>;
				def : InstRW<[FXU], (instregex "ZEXT128_(32\|64)$")>;

				// String instructions
				def : InstRW<[FXU, LSU, Lat30], (instregex "SRST$")>;

				// Move with key
				def : InstRW<[LSU, Lat8, GroupAlone], (instregex "MVCK$")>;

				// Extract CPU Time
				def : InstRW<[FXU, Lat5, LSU], (instregex "ECTG$")>;

				// Execute
				def : InstRW<[LSU, GroupAlone], (instregex "EX(RL)?$")>;

				// Program return
				def : InstRW<[FXU, Lat30], (instregex "PR$")>;

				// Inline assembly
				def : InstRW<[FXU, LSU, Lat15], (instregex "STCK$")>;
				def : InstRW<[FXU, LSU, Lat12], (instregex "STCKF$")>;
				def : InstRW<[LSU, FXU, Lat5], (instregex "STCKE$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STFLE$")>;
				def : InstRW<[FXU, Lat30], (instregex "SVC$")>;

				// Store real address
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRAG$")>;

				//===----------------------------------------------------------------------===//
				// .insn directive instructions
				//===----------------------------------------------------------------------===//

				// An "empty" sched-class will be assigned instead of the "invalid sched-class".
				// getNumDecoderSlots() will then return 1 instead of 0.
				def : InstRW<[], (instregex "Insn.*")>;


				// ----------------------------- Floating point ----------------------------- //

				//===----------------------------------------------------------------------===//
				// FP: Select instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "SelectF(32\|64\|128)$")>;
				def : InstRW<[FXU], (instregex "CondStoreF32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStoreF64(Inv)?$")>;

				//===----------------------------------------------------------------------===//
				// FP: Move instructions
				//===----------------------------------------------------------------------===//

				// Load zero
				def : InstRW<[FXU], (instregex "LZ(DR\|ER)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LZXR$")>;

				// Load
				def : InstRW<[FXU], (instregex "LER$")>;
				def : InstRW<[FXU], (instregex "LD(R\|R32\|GR)$")>;
				def : InstRW<[FXU, Lat3], (instregex "LGDR$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LXR$")>;

				// Load and Test
				def : InstRW<[FPU], (instregex "LT(D\|E)BR$")>;
				def : InstRW<[FPU], (instregex "LTEBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU], (instregex "LTDBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "LTXBR$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone],
				(instregex "LTXBRCompare(_VecPseudo)?$")>;

				// Copy sign
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRd(d\|s)$")>;
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRs(d\|s)$")>;

				//===----------------------------------------------------------------------===//
				// FP: Load instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU], (instregex "LE(Y)?$")>;
				def : InstRW<[LSU], (instregex "LD(Y\|E32)?$")>;
				def : InstRW<[LSU], (instregex "LX$")>;

				//===----------------------------------------------------------------------===//
				// FP: Store instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat7], (instregex "STD(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat7], (instregex "STE(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STX$")>;

				//===----------------------------------------------------------------------===//
				// FP: Conversion instructions
				//===----------------------------------------------------------------------===//

				// Load rounded
				def : InstRW<[FPU], (instregex "LEDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LEXBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LDXBR(A)?$")>;

				// Load lengthened
				def : InstRW<[FPU, LSU, Lat12], (instregex "LDEB$")>;
				def : InstRW<[FPU], (instregex "LDEBR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "LX(D\|E)B$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "LX(D\|E)BR$")>;

				// Convert from fixed / logical
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CX(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CEL(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CDL(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CXL(F\|G)BR$")>;

				// Convert to fixed / logical
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F\|G)XBR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLF(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLG(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "CL(F\|G)XBR$")>;

				//===----------------------------------------------------------------------===//
				// FP: Unary arithmetic
				//===----------------------------------------------------------------------===//

				// Load Complement / Negative / Positive
				def : InstRW<[FPU], (instregex "L(C\|N\|P)DBR$")>;
				def : InstRW<[FPU], (instregex "L(C\|N\|P)EBR$")>;
				def : InstRW<[FXU], (instregex "LCDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LNDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LPDFR(_32)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "L(C\|N\|P)XBR$")>;

				// Square root
				def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E\|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "SQ(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "SQXBR$")>;

				// Load FP integer
				def : InstRW<[FPU], (instregex "FIEBR(A)?$")>;
				def : InstRW<[FPU], (instregex "FIDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat15, GroupAlone], (instregex "FIXBR(A)?$")>;

				//===----------------------------------------------------------------------===//
				// FP: Binary arithmetic
				//===----------------------------------------------------------------------===//

				// Addition
				def : InstRW<[FPU, LSU, Lat12], (instregex "A(E\|D)B$")>;
				def : InstRW<[FPU], (instregex "A(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "AXBR$")>;

				// Subtraction
				def : InstRW<[FPU, LSU, Lat12], (instregex "S(E\|D)B$")>;
				def : InstRW<[FPU], (instregex "S(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "SXBR$")>;

				// Multiply
				def : InstRW<[FPU, LSU, Lat12], (instregex "M(D\|DE\|EE)B$")>;
				def : InstRW<[FPU], (instregex "M(D\|DE\|EE)BR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "MXDB$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "MXDBR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "MXBR$")>;

				// Multiply and add / subtract
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A\|S)EB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A\|S)EBR$")>;
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A\|S)DB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A\|S)DBR$")>;

				// Division
				def : InstRW<[FPU, LSU, Lat30], (instregex "D(E\|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "D(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "DXBR$")>;

				//===----------------------------------------------------------------------===//
				// FP: Comparisons
				//===----------------------------------------------------------------------===//

				// Compare
				def : InstRW<[FPU, LSU, Lat12], (instregex "C(E\|D)B$")>;
				def : InstRW<[FPU], (instregex "C(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30], (instregex "CXBR$")>;

				// Test Data Class
				def : InstRW<[FPU, LSU, Lat15], (instregex "TC(E\|D)B$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "TCXB$")>;

				}

lib/Target/SystemZ/SystemZScheduleZEC12.td

This file was added.

				//=- SystemZScheduleZEC12.td - SystemZ Scheduling Definitions --- tblgen --=//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the machine model for ZEC12 to support instruction
				// scheduling and other instruction cost heuristics.
				//
				//===----------------------------------------------------------------------===//

				def ZEC12Model : SchedMachineModel {

				let IssueWidth = 3; // 3 instructions decoded per cycle.
				let MicroOpBufferSize = 40; // Issue queues
				let LoadLatency = 1; // Optimistic load latency.

				let PostRAScheduler = 1;

				// Extra cycles for a mispredicted branch.
				let MispredictPenalty = 8;
				}

				let SchedModel = ZEC12Model in {

				// These definitions could be put in a subtarget-common include file,
				// but the include system in TableGen currently seems to reject
				// multiple includes of the same file.
				def : WriteRes<GroupAlone, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				let EndGroup = 1;
				}
				def : WriteRes<EndGroup, []> {
				let NumMicroOps = 0;
				let EndGroup = 1;
				}
				def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
				def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
				def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
				def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
				def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
				def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
				def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
				def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
				def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
				def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
				def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
				def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
				def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
				def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}

				// Execution units.
				def ZEC12_VBUnit : ProcResource<1>;
				def ZEC12_FXUnit : ProcResource<1>;
				def ZEC12_LSUnit : ProcResource<1>;
				def ZEC12_FPUnit : ProcResource<1>;

				// Subtarget specific definitions of scheduling resources.
				def : WriteRes<FXU, [ZEC12_FXUnit]> { let Latency = 1; }
				def : WriteRes<LSU, [ZEC12_LSUnit]> { let Latency = 4; }
				def : WriteRes<LSU_lat1, [ZEC12_LSUnit]> { let Latency = 1; }
				def : WriteRes<FPU, [ZEC12_FPUnit]> { let Latency = 8; }
				def : WriteRes<VBU, [ZEC12_VBUnit]>; // Virtual Branching Unit

				// -------------------------- INSTRUCTIONS ---------------------------------- //

				// InstRW constructs have been used in order to preserve the
				// readability of the InstrInfo files.

				// For each instruction, as matched by a regexp, provide a list of
				// resources that it needs. These will be combined into a SchedClass.

				//===----------------------------------------------------------------------===//
				// Stack allocation
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY

				//===----------------------------------------------------------------------===//
				// Control flow instructions
				//===----------------------------------------------------------------------===//

				// Return
				def : InstRW<[LSU_lat1, EndGroup], (instregex "Return$")>;
				def : InstRW<[LSU_lat1], (instregex "CondReturn$")>;

				// Compare and branch
				def : InstRW<[FXU], (instregex "(Asm.*)?C(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CG(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CL(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLG(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLR(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CLRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "(Asm.*)?CRB(Call\|Return)?$")>;

				// Branch
				def : InstRW<[LSU, Lat4], (instregex "(Asm.*)?BR$")>;
				def : InstRW<[LSU, Lat4], (instregex "(Asm)?BC(R)?$")>;
				def : InstRW<[VBU], (instregex "(Asm)?BRC(L)?$")>;
				def : InstRW<[FXU, EndGroup], (instregex "BRCT(G)?$")>;
				def : InstRW<[VBU], (instregex "(Asm.*)?JG$")>;
				def : InstRW<[VBU], (instregex "J$")>;
				// (Need to avoid conflict with "(Asm.*)?CG(I\|R)J$")
				def : InstRW<[VBU], (instregex "Asm(EAlt\|E\|HAlt\|HE\|H\|LAlt\|LE\|LH\|L\|NEAlt\|NE)J$")>;
				def : InstRW<[VBU], (instregex "Asm(NHAlt\|NHE\|NH\|NLAlt\|NLE\|NLH\|NL\|NO\|O)J$")>;
				def : InstRW<[FXU, FXU, FXU, LSU, Lat7, GroupAlone], (instregex "BRX(H\|LE)$")>;

				// Trap
				def : InstRW<[VBU], (instregex "(Cond)?Trap$")>;

				// Compare and trap
				def : InstRW<[FXU], (instregex "(Asm.*)?C(G)?IT$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?C(G)?RT$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLG(I\|R)T$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLFIT$")>;
				def : InstRW<[FXU], (instregex "(Asm.*)?CLRT$")>;

				//===----------------------------------------------------------------------===//
				// Select instructions
				//===----------------------------------------------------------------------===//

				// Select pseudo
				def : InstRW<[FXU], (instregex "Select(32\|64\|32Mux)$")>;

				// CondStore pseudos
				def : InstRW<[FXU], (instregex "CondStore16(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore16Mux(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore64(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8Mux(Inv)?$")>;

				//===----------------------------------------------------------------------===//
				// Call instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[VBU, FXU, FXU, Lat3, GroupAlone], (instregex "BRAS$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BASR$")>;
				def : InstRW<[LSU, Lat4], (instregex "CallB(C)?R$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BRASL$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "TLS_(G\|L)DCALL$")>;
				def : InstRW<[VBU], (instregex "CallBRCL$")>;
				def : InstRW<[VBU], (instregex "CallJG$")>;

				//===----------------------------------------------------------------------===//
				// Move instructions
				//===----------------------------------------------------------------------===//

				// Moves
				def : InstRW<[FXU, LSU, Lat5], (instregex "MV(G\|H)?HI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "MVI(Y)?$")>;

				// Move character
				def : InstRW<[LSU, LSU, LSU, FXU, Lat8, GroupAlone], (instregex "MVC$")>;

				// Pseudo -> reg move
				def : InstRW<[FXU], (instregex "COPY(_TO_REGCLASS)?$")>;
				def : InstRW<[FXU], (instregex "EXTRACT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "INSERT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "REG_SEQUENCE$")>;
				def : InstRW<[FXU], (instregex "SUBREG_TO_REG$")>;

				// Loads
				def : InstRW<[LSU], (instregex "L(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[LSU], (instregex "LG(RL)?$")>;
				def : InstRW<[LSU], (instregex "L128$")>;

				def : InstRW<[FXU], (instregex "LLIH(F\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "LLIL(F\|H\|L)$")>;

				def : InstRW<[FXU], (instregex "LG(F\|H)I$")>;
				def : InstRW<[FXU], (instregex "LHI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LR(Mux)?$")>;

				// Load and test
				def : InstRW<[FXU, LSU, Lat5], (instregex "LT(G)?$")>;
				def : InstRW<[FXU], (instregex "LT(G)?R$")>;

				// Load on condition
				def : InstRW<[FXU, LSU, Lat6], (instregex "(Asm.*)?LOC(G)?$")>;
				def : InstRW<[FXU, Lat2], (instregex "(Asm.*)?LOC(G)?R$")>;

				// Stores
				def : InstRW<[FXU, LSU, Lat5], (instregex "STG(RL)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST128$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST(Y\|FH\|RL\|Mux)?$")>;

				// Store on condition
				def : InstRW<[FXU, LSU, Lat5], (instregex "(Asm.*)?STOC(G)?$")>;

				// String moves.
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;

				//===----------------------------------------------------------------------===//
				// Sign extensions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "L(B\|H\|G)R$")>;
				def : InstRW<[FXU], (instregex "LG(B\|H\|F)R$")>;

				def : InstRW<[FXU, LSU, Lat5], (instregex "LTGF$")>;
				def : InstRW<[FXU], (instregex "LTGFR$")>;

				def : InstRW<[FXU, LSU, Lat5], (instregex "LB(H\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LH(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LH(H\|Mux\|RL)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(B\|H\|F)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(H\|F)RL$")>;

				//===----------------------------------------------------------------------===//
				// Zero extensions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "LLCR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLHR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLG(C\|H\|F)R$")>;
				def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
				def : InstRW<[LSU], (instregex "LLH(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LL(C\|H)H$")>;
				def : InstRW<[LSU], (instregex "LLHRL$")>;
				def : InstRW<[LSU], (instregex "LLG(C\|H\|F\|HRL\|FRL)$")>;

				//===----------------------------------------------------------------------===//
				// Truncations
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "STC(H\|Y\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STH(H\|Y\|RL\|Mux)?$")>;

				//===----------------------------------------------------------------------===//
				// Multi-register moves
				//===----------------------------------------------------------------------===//

				// Load multiple (estimated average of 5 ops)
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "LM(H\|Y\|G)?$")>;

				// Store multiple (estimated average of 3 ops)
				def : InstRW<[LSU, LSU, FXU, FXU, FXU, Lat10, GroupAlone],
				(instregex "STM(H\|Y\|G)?$")>;

				//===----------------------------------------------------------------------===//
				// Byte swaps
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "LRV(G)?R$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LRV(G\|H)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRV(G\|H)?$")>;

				//===----------------------------------------------------------------------===//
				// Load address instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "LA(Y\|RL)?$")>;

				// Load the Global Offset Table address
				def : InstRW<[FXU], (instregex "GOT$")>;

				//===----------------------------------------------------------------------===//
				// Absolute and Negation
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, Lat2], (instregex "LP(G)?R$")>;
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "L(N\|P)GFR$")>;
				def : InstRW<[FXU, Lat2], (instregex "LN(R\|GR)$")>;
				def : InstRW<[FXU], (instregex "LC(R\|GR)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LCGFR$")>;

				//===----------------------------------------------------------------------===//
				// Insertion
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "IC(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "IC32(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ICM(H\|Y)?$")>;
				def : InstRW<[FXU], (instregex "II(F\|H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "IIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "IILF(64)?$")>;
				def : InstRW<[FXU], (instregex "IILH(64)?$")>;
				def : InstRW<[FXU], (instregex "IILL(64)?$")>;

				//===----------------------------------------------------------------------===//
				// Addition
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "A(Y\|SI)?$")>;
				def : InstRW<[FXU, LSU, Lat6], (instregex "AH(Y)?$")>;
				def : InstRW<[FXU], (instregex "AIH$")>;
				def : InstRW<[FXU], (instregex "AFI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "AGFI$")>;
				def : InstRW<[FXU], (instregex "AGHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AGR(K)?$")>;
				def : InstRW<[FXU], (instregex "AHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AHIMux(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AL(Y)?$")>;
				def : InstRW<[FXU], (instregex "AL(FI\|HSIK)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ALG(F)?$")>;
				def : InstRW<[FXU], (instregex "ALGHSIK$")>;
				def : InstRW<[FXU], (instregex "ALGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "ALGR(K)?$")>;
				def : InstRW<[FXU], (instregex "ALR(K)?$")>;
				def : InstRW<[FXU], (instregex "AR(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AG(SI)?$")>;

				// Logical addition with carry
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "ALC(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "ALC(G)?R$")>;

				// Add with sign extension (32 -> 64)
				def : InstRW<[FXU, LSU, Lat6], (instregex "AGF$")>;
				def : InstRW<[FXU, Lat2], (instregex "AGFR$")>;

				//===----------------------------------------------------------------------===//
				// Subtraction
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "S(G\|Y)?$")>;
				def : InstRW<[FXU, LSU, Lat6], (instregex "SH(Y)?$")>;
				def : InstRW<[FXU], (instregex "SGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLFI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "SL(G\|GF\|Y)?$")>;
				def : InstRW<[FXU], (instregex "SLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "SLGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLR(K)?$")>;
				def : InstRW<[FXU], (instregex "SR(K)?$")>;

				// Subtraction with borrow
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "SLB(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "SLB(G)?R$")>;

				// Subtraction with sign extension (32 -> 64)
				def : InstRW<[FXU, LSU, Lat6], (instregex "SGF$")>;
				def : InstRW<[FXU, Lat2], (instregex "SGFR$")>;

				//===----------------------------------------------------------------------===//
				// AND
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "N(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "NGR(K)?$")>;
				def : InstRW<[FXU], (instregex "NI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "NI(Y)?$")>;
				def : InstRW<[FXU], (instregex "NIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "NILF(64)?$")>;
				def : InstRW<[FXU], (instregex "NILH(64)?$")>;
				def : InstRW<[FXU], (instregex "NILL(64)?$")>;
				def : InstRW<[FXU], (instregex "NR(K)?$")>;
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "NC$")>;

				//===----------------------------------------------------------------------===//
				// OR
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "O(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "OGR(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "OI(Y)?$")>;
				def : InstRW<[FXU], (instregex "OI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU], (instregex "OIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "OILF(64)?$")>;
				def : InstRW<[FXU], (instregex "OILH(64)?$")>;
				def : InstRW<[FXU], (instregex "OILL(64)?$")>;
				def : InstRW<[FXU], (instregex "OR(K)?$")>;
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "OC$")>;

				//===----------------------------------------------------------------------===//
				// XOR
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "X(G\|Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "XI(Y)?$")>;
				def : InstRW<[FXU], (instregex "XIFMux$")>;
				def : InstRW<[FXU], (instregex "XGR(K)?$")>;
				def : InstRW<[FXU], (instregex "XIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "XILF(64)?$")>;
				def : InstRW<[FXU], (instregex "XR(K)?$")>;
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "XC$")>;

				//===----------------------------------------------------------------------===//
				// Multiplication
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat10], (instregex "MS(GF\|Y)?$")>;
				def : InstRW<[FXU, Lat6], (instregex "MS(R\|FI)$")>;
				def : InstRW<[FXU, LSU, Lat12], (instregex "MSG$")>;
				def : InstRW<[FXU, Lat8], (instregex "MSGR$")>;
				def : InstRW<[FXU, Lat6], (instregex "MSGF(I\|R)$")>;
				def : InstRW<[FXU, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
				def : InstRW<[FXU, Lat9, GroupAlone], (instregex "MLGR$")>;
				def : InstRW<[FXU, Lat5], (instregex "MGHI$")>;
				def : InstRW<[FXU, Lat5], (instregex "MHI$")>;
				def : InstRW<[FXU, LSU, Lat9], (instregex "MH(Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Division and remainder
				//===----------------------------------------------------------------------===//

				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?$")>;
				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?$")>;

				//===----------------------------------------------------------------------===//
				// Shifts
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "SLL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRA(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SLA(K)?$")>;

				// Rotate
				def : InstRW<[FXU, LSU, Lat6], (instregex "RLL(G)?$")>;

				// Rotate and insert
				def : InstRW<[FXU], (instregex "RISBG(N\|32)?$")>;
				def : InstRW<[FXU], (instregex "RISBH(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBL(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBMux$")>;

				// Rotate and Select
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "R(N\|O\|X)SBG$")>;

				//===----------------------------------------------------------------------===//
				// Comparison
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat5], (instregex "C(G\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXU], (instregex "CFI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "CG(F\|H)I$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CG(HSI\|RL)$")>;
				def : InstRW<[FXU], (instregex "C(G)?R$")>;
				def : InstRW<[FXU], (instregex "C(HI\|IH)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CH(F\|SI)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CL(Y\|Mux\|FHSI)?$")>;
				def : InstRW<[FXU], (instregex "CLFI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLG(HRL\|HSI)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGF(RL)?$")>;
				def : InstRW<[FXU], (instregex "CLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "CLGR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGRL$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLH(F\|RL\|HSI)$")>;
				def : InstRW<[FXU], (instregex "CLIH$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLI(Y)?$")>;
				def : InstRW<[FXU], (instregex "CLR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLRL$")>;

				// Compare halfword
				def : InstRW<[FXU, LSU, Lat6], (instregex "CH(Y\|RL)?$")>;
				def : InstRW<[FXU, LSU, Lat6], (instregex "CGH(RL)?$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CHHSI$")>;

				// Compare with sign extension (32 -> 64)
				def : InstRW<[FXU, LSU, Lat6], (instregex "CGF(RL)?$")>;
				def : InstRW<[FXU, Lat2], (instregex "CGFR$")>;

				// Compare logical character
				def : InstRW<[FXU, LSU, LSU, Lat9, GroupAlone], (instregex "CLC$")>;

				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;

				// Test under mask
				def : InstRW<[FXU, LSU, Lat5], (instregex "TM(Y)?$")>;
				def : InstRW<[FXU], (instregex "TM(H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "TMHH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMHL(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLL(64)?$")>;

				//===----------------------------------------------------------------------===//
				// Prefetch
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU], (instregex "PFD(RL)?$")>;

				//===----------------------------------------------------------------------===//
				// Atomic operations
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU, EndGroup], (instregex "Serialize$")>;

				def : InstRW<[FXU, LSU, Lat5], (instregex "LAA(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAAL(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAN(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAO(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAX(G)?$")>;

				// Compare and swap
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CS(G\|Y)?$")>;

				//===----------------------------------------------------------------------===//
				// Transactional execution
				//===----------------------------------------------------------------------===//

				// Transaction begin
				def : InstRW<[LSU, LSU, FXU, FXU, FXU, FXU, FXU, Lat15, GroupAlone],
				(instregex "TBEGIN(C\|_nofloat)?$")>;

				// Transaction end
				def : InstRW<[LSU, GroupAlone], (instregex "TEND$")>;

				// Transaction abort
				def : InstRW<[LSU, GroupAlone], (instregex "TABORT$")>;

				// Extract Transaction Nesting Depth
				def : InstRW<[FXU], (instregex "ETND$")>;

				// Nontransactional store
				def : InstRW<[FXU, LSU, Lat5], (instregex "NTSTG$")>;

				//===----------------------------------------------------------------------===//
				// Processor assist
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "PPA$")>;

				//===----------------------------------------------------------------------===//
				// Miscellaneous Instructions.
				//===----------------------------------------------------------------------===//

				// Insert Program Mask
				def : InstRW<[FXU, Lat3, EndGroup], (instregex "IPM$")>;

				// Extract access register
				def : InstRW<[LSU], (instregex "EAR$")>;

				// Find leftmost one
				def : InstRW<[FXU, Lat7, GroupAlone], (instregex "FLOGR$")>;

				// Population count
				def : InstRW<[FXU, Lat3], (instregex "POPCNT$")>;

				// Extend
				def : InstRW<[FXU], (instregex "AEXT128_64$")>;
				def : InstRW<[FXU], (instregex "ZEXT128_(32\|64)$")>;

				// String instructions
				def : InstRW<[FXU, LSU, Lat30], (instregex "SRST$")>;

				// Move with key
				def : InstRW<[LSU, Lat8, GroupAlone], (instregex "MVCK$")>;

				// Extract CPU Time
				def : InstRW<[FXU, Lat5, LSU], (instregex "ECTG$")>;

				// Execute
				def : InstRW<[LSU, GroupAlone], (instregex "EX(RL)?$")>;

				// Program return
				def : InstRW<[FXU, Lat30], (instregex "PR$")>;

				// Inline assembly
				def : InstRW<[FXU, LSU, LSU, Lat9, GroupAlone], (instregex "STCK(F)?$")>;
				def : InstRW<[LSU, LSU, LSU, LSU, FXU, FXU, Lat20, GroupAlone],
				(instregex "STCKE$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STFLE$")>;
				def : InstRW<[FXU, Lat30], (instregex "SVC$")>;

				// Store real address
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRAG$")>;

				//===----------------------------------------------------------------------===//
				// .insn directive instructions
				//===----------------------------------------------------------------------===//

				// An "empty" sched-class will be assigned instead of the "invalid sched-class".
				// getNumDecoderSlots() will then return 1 instead of 0.
				def : InstRW<[], (instregex "Insn.*")>;


				// ----------------------------- Floating point ----------------------------- //

				//===----------------------------------------------------------------------===//
				// FP: Select instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU], (instregex "SelectF(32\|64\|128)$")>;
				def : InstRW<[FXU], (instregex "CondStoreF32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStoreF64(Inv)?$")>;

				//===----------------------------------------------------------------------===//
				// FP: Move instructions
				//===----------------------------------------------------------------------===//

				// Load zero
				def : InstRW<[FXU], (instregex "LZ(DR\|ER)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LZXR$")>;

				// Load
				def : InstRW<[FXU], (instregex "LER$")>;
				def : InstRW<[FXU], (instregex "LD(R\|R32\|GR)$")>;
				def : InstRW<[FXU, Lat3], (instregex "LGDR$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LXR$")>;

				// Load and Test
				def : InstRW<[FPU], (instregex "LT(D\|E)BR$")>;
				def : InstRW<[FPU], (instregex "LTEBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU], (instregex "LTDBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "LTXBR$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone],
				(instregex "LTXBRCompare(_VecPseudo)?$")>;

				// Copy sign
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRd(d\|s)$")>;
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRs(d\|s)$")>;

				//===----------------------------------------------------------------------===//
				// FP: Load instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[LSU], (instregex "LE(Y)?$")>;
				def : InstRW<[LSU], (instregex "LD(Y\|E32)?$")>;
				def : InstRW<[LSU], (instregex "LX$")>;

				//===----------------------------------------------------------------------===//
				// FP: Store instructions
				//===----------------------------------------------------------------------===//

				def : InstRW<[FXU, LSU, Lat7], (instregex "STD(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat7], (instregex "STE(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STX$")>;

				//===----------------------------------------------------------------------===//
				// FP: Conversion instructions
				//===----------------------------------------------------------------------===//

				// Load rounded
				def : InstRW<[FPU], (instregex "LEDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LEXBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LDXBR(A)?$")>;

				// Load lengthened
				def : InstRW<[FPU, LSU, Lat12], (instregex "LDEB$")>;
				def : InstRW<[FPU], (instregex "LDEBR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "LX(D\|E)B$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "LX(D\|E)BR$")>;

				// Convert from fixed / logical
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CX(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CEL(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CDL(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CXL(F\|G)BR$")>;

				// Convert to fixed / logical
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F\|G)XBR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLF(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLG(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "CL(F\|G)XBR$")>;

				//===----------------------------------------------------------------------===//
				// FP: Unary arithmetic
				//===----------------------------------------------------------------------===//

				// Load Complement / Negative / Positive
				def : InstRW<[FPU], (instregex "L(C\|N\|P)DBR$")>;
				def : InstRW<[FPU], (instregex "L(C\|N\|P)EBR$")>;
				def : InstRW<[FXU], (instregex "LCDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LNDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LPDFR(_32)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "L(C\|N\|P)XBR$")>;

				// Square root
				def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E\|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "SQ(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "SQXBR$")>;

				// Load FP integer
				def : InstRW<[FPU], (instregex "FIEBR(A)?$")>;
				def : InstRW<[FPU], (instregex "FIDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat15, GroupAlone], (instregex "FIXBR(A)?$")>;

				//===----------------------------------------------------------------------===//
				// FP: Binary arithmetic
				//===----------------------------------------------------------------------===//

				// Addition
				def : InstRW<[FPU, LSU, Lat12], (instregex "A(E|D)B$")>;
				def : InstRW<[FPU], (instregex "A(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "AXBR$")>;

				// Subtraction
				def : InstRW<[FPU, LSU, Lat12], (instregex "S(E|D)B$")>;
				def : InstRW<[FPU], (instregex "S(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "SXBR$")>;

				// Multiply
				def : InstRW<[FPU, LSU, Lat12], (instregex "M(D|DE|EE)B$")>;
				def : InstRW<[FPU], (instregex "M(D|DE|EE)BR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "MXDB$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "MXDBR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "MXBR$")>;

				// Multiply and add / subtract
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)EB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)EBR$")>;
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)DB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)DBR$")>;

				// Division
				def : InstRW<[FPU, LSU, Lat30], (instregex "D(E|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "D(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "DXBR$")>;

				//===----------------------------------------------------------------------===//
				// FP: Comparisons
				//===----------------------------------------------------------------------===//

				// Compare
				def : InstRW<[FPU, LSU, Lat12], (instregex "C(E|D)B$")>;
				def : InstRW<[FPU], (instregex "C(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30], (instregex "CXBR$")>;

				// Test Data Class
				def : InstRW<[FPU, LSU, Lat15], (instregex "TC(E|D)B$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "TCXB$")>;

				}

lib/Target/SystemZ/SystemZTargetMachine.cpp

//===-- SystemZTargetMachine.cpp - Define TargetMachine for SystemZ -------===//		//===-- SystemZTargetMachine.cpp - Define TargetMachine for SystemZ -------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "SystemZTargetMachine.h"		#include "SystemZTargetMachine.h"
#include "SystemZTargetTransformInfo.h"		#include "SystemZTargetTransformInfo.h"
		#include "SystemZMachineScheduler.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
#include "llvm/CodeGen/TargetPassConfig.h"		#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/Support/TargetRegistry.h"		#include "llvm/Support/TargetRegistry.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"		#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"

using namespace llvm;		using namespace llvm;

extern cl::opt<bool> MISchedPostRA;
extern "C" void LLVMInitializeSystemZTarget() {		extern "C" void LLVMInitializeSystemZTarget() {
// Register the target.		// Register the target.
RegisterTargetMachine<SystemZTargetMachine> X(getTheSystemZTarget());		RegisterTargetMachine<SystemZTargetMachine> X(getTheSystemZTarget());
}		}

// Determine whether we use the vector ABI.		// Determine whether we use the vector ABI.
static bool UsesVectorABI(StringRef CPU, StringRef FS) {		static bool UsesVectorABI(StringRef CPU, StringRef FS) {
// We use the vector ABI whenever the vector facility is available.		// We use the vector ABI whenever the vector facility is available.
[... 80 lines not shown ...]
public:		public:
SystemZPassConfig(SystemZTargetMachine *TM, PassManagerBase &PM)		SystemZPassConfig(SystemZTargetMachine *TM, PassManagerBase &PM)
: TargetPassConfig(TM, PM) {}		: TargetPassConfig(TM, PM) {}

SystemZTargetMachine &getSystemZTargetMachine() const {		SystemZTargetMachine &getSystemZTargetMachine() const {
return getTM<SystemZTargetMachine>();		return getTM<SystemZTargetMachine>();
}		}

		ScheduleDAGInstrs *
		createPostMachineScheduler(MachineSchedContext *C) const override {
		return new ScheduleDAGMI(C, make_unique<SystemZPostRASchedStrategy>(C),
		/*IsPostRA=*/true);
		}

void addIRPasses() override;		void addIRPasses() override;
bool addInstSelector() override;		bool addInstSelector() override;
void addPreSched2() override;		void addPreSched2() override;
void addPreEmitPass() override;		void addPreEmitPass() override;
};		};
} // end anonymous namespace		} // end anonymous namespace

void SystemZPassConfig::addIRPasses() {		void SystemZPassConfig::addIRPasses() {
[... 50 lines not shown ...]	void SystemZPassConfig::addPreEmitPass() {
// preventing that would be a win or not.		// preventing that would be a win or not.
if (getOptLevel() != CodeGenOpt::None)		if (getOptLevel() != CodeGenOpt::None)
addPass(createSystemZElimComparePass(getSystemZTargetMachine()), false);		addPass(createSystemZElimComparePass(getSystemZTargetMachine()), false);
addPass(createSystemZLongBranchPass(getSystemZTargetMachine()));		addPass(createSystemZLongBranchPass(getSystemZTargetMachine()));

// Do final scheduling after all other optimizations, to get an		// Do final scheduling after all other optimizations, to get an
// optimal input for the decoder (branch relaxation must happen		// optimal input for the decoder (branch relaxation must happen
// after block placement).		// after block placement).
if (getOptLevel() != CodeGenOpt::None) {		if (getOptLevel() != CodeGenOpt::None)
if (MISchedPostRA)
addPass(&PostMachineSchedulerID);		addPass(&PostMachineSchedulerID);
else
addPass(&PostRASchedulerID);
}
}		}

TargetPassConfig *SystemZTargetMachine::createPassConfig(PassManagerBase &PM) {		TargetPassConfig *SystemZTargetMachine::createPassConfig(PassManagerBase &PM) {
return new SystemZPassConfig(this, PM);		return new SystemZPassConfig(this, PM);
}		}

TargetIRAnalysis SystemZTargetMachine::getTargetIRAnalysis() {		TargetIRAnalysis SystemZTargetMachine::getTargetIRAnalysis() {
return TargetIRAnalysis([this](const Function &F) {		return TargetIRAnalysis([this](const Function &F) {
return TargetTransformInfo(SystemZTTIImpl(this, F));		return TargetTransformInfo(SystemZTTIImpl(this, F));
});		});
}		}

test/CodeGen/SystemZ/vec-args-06.ll

	[... 36 lines not shown ...]

	; More than eight vector return values use sret.			; More than eight vector return values use sret.
	define { <2 x double>, <2 x double>, <2 x double>, <2 x double>,			define { <2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double>, <2 x double>, <2 x double>, <2 x double>,			<2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double> } @f2() {			<2 x double> } @f2() {
	; CHECK-LABEL: f2:			; CHECK-LABEL: f2:
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 128(%r2)			; CHECK-DAG: vst [[VTMP]], 128(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 112(%r2)			; CHECK-DAG: vst [[VTMP]], 112(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 96(%r2)			; CHECK-DAG: vst [[VTMP]], 96(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 80(%r2)			; CHECK-DAG: vst [[VTMP]], 80(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 64(%r2)			; CHECK-DAG: vst [[VTMP]], 64(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 48(%r2)			; CHECK-DAG: vst [[VTMP]], 48(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 32(%r2)			; CHECK-DAG: vst [[VTMP]], 32(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 16(%r2)			; CHECK-DAG: vst [[VTMP]], 16(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 0(%r2)			; CHECK: vst [[VTMP]], 0(%r2)
	; CHECK: br %r14			; CHECK: br %r14
	ret { <2 x double>, <2 x double>, <2 x double>, <2 x double>,			ret { <2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double>, <2 x double>, <2 x double>, <2 x double>,			<2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double> }			<2 x double> }
	{ <2 x double> <double 1.0, double 1.1>,			{ <2 x double> <double 1.0, double 1.1>,
	<2 x double> <double 2.0, double 2.1>,			<2 x double> <double 2.0, double 2.1>,
	<2 x double> <double 3.0, double 3.1>,			<2 x double> <double 3.0, double 3.1>,
	<2 x double> <double 4.0, double 4.1>,			<2 x double> <double 4.0, double 4.1>,
	<2 x double> <double 5.0, double 5.1>,			<2 x double> <double 5.0, double 5.1>,
	<2 x double> <double 6.0, double 6.1>,			<2 x double> <double 6.0, double 6.1>,
	<2 x double> <double 7.0, double 7.1>,			<2 x double> <double 7.0, double 7.1>,
	<2 x double> <double 8.0, double 8.1>,			<2 x double> <double 8.0, double 8.1>,
	<2 x double> <double 9.0, double 9.1> }			<2 x double> <double 9.0, double 9.1> }
	}			}

test/CodeGen/SystemZ/vec-perm-12.ll

	; Test inserting a truncated value into a vector element			; Test inserting a truncated value into a vector element
	;			;
	; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \			; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \
	; RUN: FileCheck -check-prefix=CHECK-CODE %s			; RUN: FileCheck -check-prefix=CHECK-CODE %s
	; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \			; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \
	; RUN: FileCheck -check-prefix=CHECK-VECTOR %s			; RUN: FileCheck -check-prefix=CHECK-VECTOR %s

	define <4 x i32> @f1(<4 x i32> %x, i64 %y) {			define <4 x i32> @f1(<4 x i32> %x, i64 %y) {
	; CHECK-CODE-LABEL: f1:			; CHECK-CODE-LABEL: f1:
	; CHECK-CODE: vlvgf [[ELT:%v[0-9]+]], %r2, 0			; CHECK-CODE-DAG: vlvgf [[ELT:%v[0-9]+]], %r2, 0
	; CHECK-CODE: larl [[REG:%r[0-5]]],			; CHECK-CODE-DAG: larl [[REG:%r[0-5]]],
	; CHECK-CODE: vl [[MASK:%v[0-9]+]], 0([[REG]])			; CHECK-CODE-DAG: vl [[MASK:%v[0-9]+]], 0([[REG]])
	; CHECK-CODE: vperm %v24, %v24, [[ELT]], [[MASK]]			; CHECK-CODE: vperm %v24, %v24, [[ELT]], [[MASK]]
	; CHECK-CODE: br %r14			; CHECK-CODE: br %r14

	; CHECK-VECTOR: .byte 12			; CHECK-VECTOR: .byte 12
	; CHECK-VECTOR-NEXT: .byte 13			; CHECK-VECTOR-NEXT: .byte 13
	; CHECK-VECTOR-NEXT: .byte 14			; CHECK-VECTOR-NEXT: .byte 14
	; CHECK-VECTOR-NEXT: .byte 15			; CHECK-VECTOR-NEXT: .byte 15
	; CHECK-VECTOR-NEXT: .byte 8			; CHECK-VECTOR-NEXT: .byte 8
	[... 23 lines not shown ...]