This is an archive of the discontinued LLVM Phabricator instance.

SystemZ scheduling implementation
ClosedPublic

Authored by jonpa on Feb 15 2016, 2:31 AM.

Details

Reviewers

uweigand
atrick
hfinkel

Summary

General review and comments / suggestions would be greatly appreciated for this implementation of instruction scheduling for SystemZ.

It contains some changes outside the SystemZ backend, which will be addressed in separate revisions, but are included here also since they are part of this project.

There are some experimental parts left, which will not be committed, such as statistics and options.

Main points are:

Post-RA MI scheduling with a HazardRecognizer for decoder groups, and a custom SchedStrategy.
Pre-RA MI scheduling enabled.
Instruction scheduling classes and definitions for z13, zEC12 and z196. z10 scheduling is not supported.
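
To make the decoder-group idea above concrete, here is a minimal C++ sketch. It is not the patch's SystemZHazardRecognizer; the struct, its members, and the cost definition are hypothetical. The only fact taken from the patch is the header comment in SystemZHazardRecognizer.h saying that a decoder group holds at most 3 uops and that group-beginning instructions are best scheduled when the current group is empty.

```cpp
#include <cassert>

// Hypothetical, simplified decoder-group tracker (illustration only).
// Assumes every instruction needs between 1 and 3 decoder slots.
struct DecoderGroupSketch {
  static constexpr unsigned GroupSize = 3; // from the patch's header comment
  unsigned CurrGroupSize = 0;              // slots used in the open group

  // Number of decoder slots left empty if this instruction is issued now.
  unsigned wastedSlots(unsigned NumSlots, bool BeginsGroup) const {
    if (CurrGroupSize == 0)
      return 0; // an empty group accepts anything
    if (BeginsGroup || CurrGroupSize + NumSlots > GroupSize)
      return GroupSize - CurrGroupSize; // open group must be closed early
    return 0;
  }

  // Record that the instruction was issued and update the group state.
  void emit(unsigned NumSlots, bool BeginsGroup, bool EndsGroup) {
    if (wastedSlots(NumSlots, BeginsGroup) > 0)
      CurrGroupSize = 0; // close the partially filled group
    CurrGroupSize += NumSlots;
    if (EndsGroup || CurrGroupSize >= GroupSize)
      CurrGroupSize = 0; // group is complete, start a new one
  }
};
```

A scheduling strategy could then prefer, among otherwise equal candidates, the one with the smallest wastedSlots() value.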

Diff Detail

Event Timeline

jonpa updated this revision to Diff 47963.Feb 15 2016, 2:31 AM

jonpa retitled this revision from to SystemZ scheduling implementation.

jonpa updated this object.

jonpa added reviewers: atrick, hfinkel.

jonpa added a subscriber: llvm-commits.

Herald added subscribers: qcolombet, MatzeB. Feb 15 2016, 2:31 AM

A minor change to make Release builds succeed.

Overall nice job. I won't be able to review your hazard recognizer or any of the SystemZ models.

lib/Target/SystemZ/SystemZISelLowering.cpp
123–125 ↗	(On Diff #47989)	You won't be able to rely on this since SelectionDAG scheduler is deprecated. It's just waiting for a replacement. I think you should focus on the right MI scheduler heuristics for your target. That said, I can see why you did this because your MI scheduler is top-down only so it might be hard for you to control register pressure. As an alternative, we could easily support a two-pass MI scheduler, bottom-up, then top-down.
lib/Target/SystemZ/SystemZInstrInfo.h
171–173	FYI, the "new" machine model is meant to be flexible enough that you don't need to create your own hazard recognizer (you can add predicates and arbitrary pseudo machine resources). However, it's tricky to do that and fine just to use a hazard recognizer when you have complicated decode/issue group constraints.
lib/Target/SystemZ/SystemZScheduleZ13.td
44	In-order scheduling with multiple functional units of the same type is somewhat broken in the generic scheduler. ReservedCycles is only tracking a worst-case resource availability across all units. It should really be a two-dimensional array. I know this has been fixed out-of-tree in at least one target but never pushed back. It would be great if you fixed that!
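
A rough C++ sketch of the "two-dimensional" bookkeeping suggested in the comment above. This is not the generic scheduler's code; the class and method names are hypothetical, and it assumes every resource kind has at least one unit. Each unit instance gets its own "reserved until" cycle, so a second unit of the same kind stays available while the first is busy.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch: one reservation entry per unit instance of each
// resource kind, instead of a single worst-case value per kind.
class PerUnitReservation {
  // ReservedUntil[Kind][Unit] = first cycle at which that unit is free again.
  std::vector<std::vector<unsigned>> ReservedUntil;

public:
  // UnitsPerKind[K] is the number of identical units of kind K (must be >= 1).
  explicit PerUnitReservation(const std::vector<unsigned> &UnitsPerKind) {
    for (unsigned N : UnitsPerKind)
      ReservedUntil.emplace_back(N, 0u);
  }

  // Earliest cycle at which any unit of this kind is free.
  unsigned nextFreeCycle(unsigned Kind) const {
    const std::vector<unsigned> &Units = ReservedUntil[Kind];
    return *std::min_element(Units.begin(), Units.end());
  }

  // Reserve the earliest-free unit for Duration cycles, starting no earlier
  // than Cycle. Returns the cycle at which the reservation actually starts.
  unsigned reserve(unsigned Kind, unsigned Cycle, unsigned Duration) {
    std::vector<unsigned> &Units = ReservedUntil[Kind];
    auto It = std::min_element(Units.begin(), Units.end());
    unsigned Start = std::max(Cycle, *It);
    *It = Start + Duration;
    return Start;
  }
};
```

With two units of one kind, two back-to-back reservations can both start in cycle 0; a single worst-case counter would delay the second one.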

This revision is now accepted and ready to land.Feb 15 2016, 11:22 AM

jonpa added inline comments.Feb 16 2016, 6:08 AM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–125 ↗	(On Diff #47989)	I would hope that the pre-ra MI scheduler is bidirectional, due to the overrideSchedPolicy() call, where this is selected. Is there a better/normal way of doing this, perhaps?
lib/Target/SystemZ/SystemZScheduleZ13.td
44	This unit is handled as a special case by the HazardRecognizer, I guess partly because I couldn't see that the code - as you say - did what I wanted. That would be interesting to fix... could this out-of-tree target perhaps push its fix?

Latencies of z13 vector instructions corrected.

One more test case updated.

jonpa added a reviewer: uweigand.Feb 17 2016, 5:18 AM

atrick added inline comments.Feb 17 2016, 9:44 AM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–138 ↗	(On Diff #48068)	Right, I was looking at your PostRA policy, which is rightly top-down. That said, you might still achieve better register pressure results by a two-pass scheduling approach. I tried hard to wedge all heuristics into a single pass because I was paranoid about compile time. Ultimately it's whatever works for your target, I was just pointing out that SelectionDAG is a bad place for scheduler heuristics. We could support a multiple-pass MI scheduler if anyone needs it.
lib/Target/SystemZ/SystemZScheduleZ13.td
45	It wasn't a complete/general fix. But yes, I'll encourage anyone I can to improve the in-tree code. Out-of-tree work usually leads to problems. On the other hand, making it easy to write custom, possibly out-of-tree schedulers was a major goal of MI scheduling.

jonpa added inline comments.Feb 17 2016, 11:54 PM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–138 ↗	(On Diff #48068)	I am curious as to what you think would be the possibilities of a multi-pass scheduling approach. I have tried this once before like: First do a minimal reg-pressure scheduling. Then increase parallelism (overlapping live intervals) only when it seems to not cause too much spilling. Is this also what you had in mind? Currently this is not needed for SystemZ, as the main focus at least right now is on JIT compilation.

atrick added inline comments.Feb 18 2016, 8:27 AM

lib/Target/SystemZ/SystemZISelLowering.cpp
123–138 ↗	(On Diff #48068)	Yes that's what I had in mind. I didn't take that approach because I wanted to preserve source order in common cases where an out-of-order target has enough registers, as well as compile time.

Latencies corrected, mainly for z13 vector instructions.

Improved modelling of execution units by separating execution units and decoder slots needed for each instruction. Instructions with a double use of exec unit or with a coupled use of the LSU now get this modelled.

For z13, more of the pipelines have been modelled (FXa, FXb, and the various vector pipelines).

LSU latency corrected to 4.
Latencies properly summed for cracked / expanded instructions. Instructions with joined dispatch have the same latency as single-issue type instructions.

Tried to use an include of common defs for different SchedModels, but TableGen rejected this -- see the comment at, for example, the top of SystemZScheduleZ13.td.

This patch is on its way, but some regressions need to be fixed first.

jonpa mentioned this in D24451: [LoopUnroller] Replace UnrollingPreferences::Force with ForceMaxCount + SystemZ getUnrollingPreferences()..Sep 20 2016, 4:10 AM

This was already approved before, but that was quite some time ago, so I now reopen this review since my patch then never passed the performance measurements.

This is now *post-RA scheduling only* for SystemZ. Nearly all of the patch belongs to the SystemZ backend.

Herald added subscribers: modocache, mgorny, beanz. Oct 7 2016, 5:01 AM

NFC update per Uli's requests. No longer any common code changes.

uweigand added inline comments.Oct 16 2016, 8:37 AM

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
45	I guess instead of this we could simply use getInstrInfo on the SchedModel.
85	Is there a reason why this is a separate function and not just done directly in ::Reset()?
155	This looks a bit ad-hoc ... isn't there a more generic way to find the shorter name?
lib/Target/SystemZ/SystemZHazardRecognizer.h
121	Why does groupingCost use the return value while resourcesCost uses an output parameter?
lib/Target/SystemZ/SystemZMachineScheduler.h
58	As discussed offline, we really need to get rid of the global/static variable here. I note that there is quite a bit of similarity between this "preliminary" sorter and the final sorter in Candidate. Maybe we should actually store "preliminary" candidates in the Available list (with GroupingCost and ResourceCost only set to 0 or 1 depending on whether grouping or reserved resources are involved), and then update the cost parameters with actual values once we know them? We then might even be able to reuse the same comparison routine ...
lib/Target/SystemZ/SystemZScheduleZ13.td
28	We should really try to get this complete, so that instructions added in the future don't accidentally lack scheduling information. How difficult would it be to get there?
104	Ideally, the ordering in these files should mostly correspond to the ordering in the original InstrInfo files, just to make them easier to find ...
105	Also, it is somewhat annoying that we need to list not just the basic instruction definitions, but all the various aliases as well. I'm wondering if there isn't some way to annotate the Alias definition in the main file with the opcode the alias will be resolved to in the end, so that can be used for scheduling purposes ...
225	This is not an "And", it's a non-transactional store and should go with the transaction-related instructions.
582	It would be nice to at least separate out vector floating-point instructions, so we can easily see where W variants are needed.
739	I don't think there's a real difference between those and the ones listed under Other.
764	This is just an alias for a LARL and should go there.

Updated per requests.

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

For the rest, please see replies to comments below.

jonpa requested a review of this revision.Oct 18 2016, 5:46 AM

jonpa edited edge metadata.

jonpa added inline comments.

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
155	Fixed by using string::substr()/resize() instead. Now all units should really be named per a Z13_XXXUnit pattern.
lib/Target/SystemZ/SystemZMachineScheduler.h
58	I gave this a try, but I thought it was a bit messy to flip the sorting variables back and forth inside a set of Candidates. I instead use the isScheduleHigh flag for groupers / FPd ops, which simplifies the SUSorter method while also eliminating the static variable. The iteration in pickNode() should be nearly unaffected, since these nodes are quite rare.
lib/Target/SystemZ/SystemZScheduleZ13.td
28	I added the hasNoSchedulingInfo flag on the appropriate instructions, and then set CompleteModel (the reason I did not do that before is that, AFAICR, this flag then also demanded modeling of operand writes or something of that sort). It is worth mentioning that this triggers an error during build for any instruction missing scheduling input for all subtargets. In a debug build, TargetSchedule.cpp will then cause an abort during compilation if the subtarget does not have scheduling input for an emitted instruction.
104	I have reorganized the files completely to match the InstrInfo file sections.
105	Could not as of yet find any way to achieve this, but there might be some way...
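
The isScheduleHigh-based ordering described in the reply for SystemZMachineScheduler.h line 58 might look roughly like the C++ sketch below. The struct is a hypothetical stand-in for the few SUnit fields involved, and the height and NodeNum tie-breakers are assumptions for illustration, not taken from the patch.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the few SUnit fields used by the comparator.
struct NodeSketch {
  unsigned NodeNum;    // unique id, used as the final tie-breaker
  unsigned Height;     // assumed secondary criterion
  bool isScheduleHigh; // set for groupers / FPd ops per the reply above
};

// Orders flagged nodes first, so no global/static state is needed.
struct HighFirstSorter {
  bool operator()(const NodeSketch *L, const NodeSketch *R) const {
    if (L->isScheduleHigh != R->isScheduleHigh)
      return L->isScheduleHigh; // flagged nodes sort to the front
    if (L->Height != R->Height)
      return L->Height > R->Height; // then taller nodes
    return L->NodeNum < R->NodeNum; // then original order
  }
};

// An "Available" set that iterates in the order defined above.
using AvailableSet = std::set<const NodeSketch *, HighFirstSorter>;
```

Because the flag lives on each node, the comparator is stateless, which is the property that lets the static variable go away.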

jonpa added inline comments.Oct 18 2016, 5:50 AM

lib/Target/SystemZ/SystemZHazardRecognizer.h
121	During experiments I have also returned a cost here for the 'other' processor side. I guess I could change that back now until it's needed again...?

In D17260#572785, @jonpa wrote:

Updated per requests.

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

For the rest, please see replies to comments below.

Looking quite good now, thanks! The one thing I'm still wondering about is the hasNoSchedulingInfo flags. Those are of course fine for the .insn directive, and also for custom-inserter pseudos. However, I don't think we want them for the Asm* branch variants; those are just regular branch instructions that might be used in inline assembler, and they really should have the same scheduling info as the standard form of the branch instructions. I think it should be straightforward to implement this by adding "(Asm)?" to the instregex strings for the branch instructions.
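
As a small illustration of the "(Asm)?" suggestion, the C++ sketch below uses std::regex rather than TableGen's instregex, so it only demonstrates how an optional group lets one pattern cover both spellings. The opcode names BRX and BRXAsm are hypothetical placeholders, not real SystemZ instruction names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if Name is matched in full by Pattern. std::regex is used
// purely to illustrate the optional "(Asm)?" group; this is not TableGen.
bool matchesOpcode(const std::string &Pattern, const std::string &Name) {
  return std::regex_match(Name, std::regex(Pattern));
}
```

With this helper, the pattern "BRX(Asm)?" accepts both "BRX" and "BRXAsm", while the plain pattern "BRX" accepts only the former.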

lib/Target/SystemZ/SystemZMachineScheduler.h
58	OK, that makes sense.
lib/Target/SystemZ/SystemZScheduleZ13.td
27	I think this is the default setting, so we can just leave it out here.
28	Great, thanks! I do think we indeed want the error, even for older subtargets. Hopefully in the not-too-distant future we'll have completed the instruction set for the old processors anyway ...
104	Excellent, it is much more readable now!
105	Oh well, if there's no straightforward way, it's OK the way it is now ...
lib/Target/SystemZ/SystemZScheduleZ196.td
68	What's the EC12 doing here?

In D17260#572785, @jonpa wrote:

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

I'm not sure I understand the problem. I *think* you can mark your scheduling model "complete", then add InstAlias records in your .td file without adding new InstRW records...

In D17260#573108, @atrick wrote:

In D17260#572785, @jonpa wrote:

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

I'm not sure I understand the problem. I *think* you can mark your scheduling model "complete", then add InstAlias records in your .td file without adding new InstRW records...

This is not about InstAlias records (which are aliases for the assembler) -- those indeed work fine. The issue here was about aliases for the code generator. We use those usually because some instructions need operands on the MC level that we don't want to expose on the MI level. For example, a "return" instruction on SystemZ is actually a "br %r14", but at the MI level this doesn't have any operands, but is simply defined as an alias:

def Return : Alias<2, (outs), (ins), [(z_retflag)]>;

where Alias is formally an Instruction, but doesn't have any information about mnemonic or opcodes. Instead, when this alias is about to be emitted, we emit an actual BR instruction pattern, adding the R14 operand at this point.

This means that to get a complete scheduler model, we have to duplicate the scheduling info for BR also for Return. It would be nice if there were instead a way to say in the definition of Return to just look at BR for scheduling info.

In D17260#573123, @uweigand wrote:
In D17260#573108, @atrick wrote:

In D17260#572785, @jonpa wrote:

Does anyone know how to say "Instruction B should have the same scheduling class as instruction A" ? This was the only point I could not get fixed.

I'm not sure I understand the problem. I *think* you can mark your scheduling model "complete", then add InstAlias records in your .td file without adding new InstRW records...

This is not about InstAlias records (which are aliases for the assembler) -- those indeed work fine. The issue here was about aliases for the code generator. We use those usually because some instructions need operands on the MC level that we don't want to expose on the MI level. For example, a "return" instruction on SystemZ is actually a "br %r14", but at the MI level this doesn't have any operands, but is simply defined as an alias:
def Return : Alias<2, (outs), (ins), [(z_retflag)]>;
where Alias is formally an Instruction, but doesn't have any information about mnemonic or opcodes. Instead, when this alias is about to be emitted, we emit an actual BR instruction pattern, adding the R14 operand at this point.

This means that to get a complete scheduler model, we have to duplicate the scheduling info for BR also for Return. It would be nice if there were instead a way to say in the definition of Return to just look at BR for scheduling info.

The scheduling class is on the MCInstrDesc. AFAIK, there's no abstract way to tie your aliasing pseudo instruction's MCInstrDesc to their lowered instruction's MCInstrDesc. You can try to factor instruction records in the InstrInfo.td file itself by using a SchedRW field, but that's not the way you've structured things. I think the only reasonable thing to do here is duplicate the InstRW records. You've basically told CodeGen that these are two distinct MC instrs.

In D17260#573190, @atrick wrote:

The scheduling class is on the MCInstrDesc. AFAIK, there's no abstract way to tie your aliasing pseudo instruction's MCInstrDesc to their lowered instruction's MCInstrDesc. You can try to factor instruction records in the InstrInfo.td file itself by using a SchedRW field, but that's not the way you've structured things. I think the only reasonable thing to do here is duplicate the InstRW records. You've basically told CodeGen that these are two distinct MC instrs.

Yes, it looks like this is what we'll have to do. But I guess that's fine, we don't have all that many of those aliases anyway.

Updated:
Asm* instructions now also get a useful schedclass.
getNumDecoderSlots() fixed to handle generic opcodes correctly.
resourcesCost() returns an int, just like groupingCost()
'CompleteModel = 1' removed

What are your thoughts on asserting for target instructions' sched class desc if CompleteModel is true? (see comment below)

jonpa added inline comments.Oct 19 2016, 4:39 AM

lib/Target/SystemZ/SystemZScheduleZ13.td
27	aah, right.
28	I found that my previous statement on the compile-time checking for scheduling input was not actually correct; it does not really cover all instructions. Currently asserts trigger only in two cases. (1) All subtargets have omitted the instruction from the Schedule .td files. TableGen catches this, and only this, because it isn't clever enough to know whether a given subtarget (with a missing sched class for an instruction) actually supports that instruction or not. (2) An instruction has a def operand and the sched class does not have a WriteLatency entry for it. This is a specialized assert (in computeOperandLatency()) which doesn't cover all instructions: I could for instance remove the InstRW for a compare and not get any assert triggering. So there really isn't any general assert that checks that, for a subtarget with a CompleteModel, all instructions actually emitted have a sane scheduling class. What happened when I removed the InstRW for an instruction was that it got its own (valid) schedclass for the subtarget, but with just 0 values. I think we could catch the error of forgetting a subtarget/instruction sched annotation with an assert in the scheduler that demands at least one uop for any target instruction. This has at least worked well during my experiments previously. Should I add this just in the SystemZ backend? Or could it be part of the common code somewhere? (I am thinking this should be done both pre-ra and post-ra). Or is there any reason not to demand this that I have missed?
lib/Target/SystemZ/SystemZScheduleZ196.td
68	Good heavens!
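
The "at least one uop for any target instruction" check proposed in the comment on SystemZScheduleZ13.td line 28 could be sketched in C++ as below. The structs are hypothetical stand-ins (the real types live in LLVM's MC and CodeGen layers), and where such a check should live is exactly the open question in the thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the scheduling-class data discussed above.
struct SchedClassSketch {
  bool Valid;           // whether the subtarget has a sched class at all
  unsigned NumMicroOps; // 0 when the InstRW entry was forgotten
};

struct InstrSketch {
  bool IsTargetInstr; // false for generic pseudos that emit nothing
  SchedClassSketch SC;
};

// True if every target instruction in the region has a valid scheduling
// class with at least one micro-op.
bool allTargetInstrsHaveUops(const std::vector<InstrSketch> &Region) {
  for (const InstrSketch &I : Region) {
    if (!I.IsTargetInstr)
      continue; // generic instructions are exempt from the check
    if (!I.SC.Valid || I.SC.NumMicroOps == 0)
      return false; // missing or all-zero scheduling info
  }
  return true;
}
```

A scheduler could assert on this predicate per region; as noted later in the thread, that only sees instructions the compiled code happens to use, which is why a TableGen-side check is preferred.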

uweigand added inline comments.Oct 19 2016, 6:21 AM

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
48	This looks OK to me. However, with this change we should now give a scheduling class to the DirectiveInsn pseudo instruction -- these do emit some instruction, we just don't know which one, so it should probably be modeled as some "generic" instruction.
lib/Target/SystemZ/SystemZHazardRecognizer.h
113	Seems you forgot to update the comment when changing the code :-)
lib/Target/SystemZ/SystemZScheduleZ13.td
28	I don't think doing it in the backend is the right place. In the backend, you only see the instructions that the code being compiled happens to use; and when you do find an error there, there's not much you can do. The right place does seem to be TableGen. And in fact, I had interpreted the code in CodeGenSchedModels::checkCompleteness to do just that check. If this doesn't work as expected, it probably ought to be fixed there. But in any case this is a separate problem and shouldn't hold up this patch.

Updated with an empty sched class for Insn.. instructions, so that getNumDecoderSlots() will return 1 and not 0.

jonpa added inline comments.Oct 19 2016, 6:44 AM

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
48	I used an empty InstRW construct, which seems to do the job.

OK, this looks good to me now. Thanks!

This revision is now accepted and ready to land.Oct 19 2016, 8:22 AM

Committed as r284704.

uweigand mentioned this in D26156: Fix per-processor model scheduler definition completeness check.Oct 31 2016, 9:50 AM

Revision Contents

Path	Size
include/llvm/CodeGen/ScheduleDAG.h	7 lines
lib/CodeGen/ScheduleDAGInstrs.cpp	4 lines
lib/Target/SystemZ/CMakeLists.txt	2 lines
lib/Target/SystemZ/SystemZ.td	5 lines
lib/Target/SystemZ/SystemZHazardRecognizer.h	146 lines
lib/Target/SystemZ/SystemZHazardRecognizer.cpp	394 lines
lib/Target/SystemZ/SystemZInstrInfo.h	8 lines
lib/Target/SystemZ/SystemZInstrInfo.cpp	35 lines
lib/Target/SystemZ/SystemZMachineScheduler.h	113 lines
lib/Target/SystemZ/SystemZMachineScheduler.cpp	144 lines
lib/Target/SystemZ/SystemZProcessors.td	7 lines
lib/Target/SystemZ/SystemZSchedule.td	70 lines
lib/Target/SystemZ/SystemZScheduleZ13.td	787 lines
lib/Target/SystemZ/SystemZScheduleZ196.td	579 lines
lib/Target/SystemZ/SystemZScheduleZEC12.td	598 lines
lib/Target/SystemZ/SystemZTargetMachine.cpp	15 lines
test/CodeGen/SystemZ/vec-args-06.ll	32 lines
test/CodeGen/SystemZ/vec-perm-12.ll	6 lines

Diff 73919

include/llvm/CodeGen/ScheduleDAG.h

[283 lines not shown; context: public:]
     bool isPending : 1; // True once pending.
     bool isAvailable : 1; // True once available.
     bool isScheduled : 1; // True once scheduled.
     bool isScheduleHigh : 1; // True if preferable to schedule high.
     bool isScheduleLow : 1; // True if preferable to schedule low.
     bool isCloned : 1; // True if this node has been cloned.
     bool isUnbuffered : 1; // Uses an unbuffered resource.
     bool hasReservedResource : 1; // Uses a reserved resource.
+    bool affectsGrouping : 1; // Begins or ends decoder group.
     Sched::Preference SchedulingPref; // Scheduling preference.

   private:
     bool isDepthCurrent : 1; // True if Depth is current.
     bool isHeightCurrent : 1; // True if Height is current.
     unsigned Depth; // Node depth.
     unsigned Height; // Node height.
   public:
[9 lines not shown; context: SUnit(SDNode *node, unsigned nodenum)]
       : Node(node), Instr(nullptr), OrigNode(nullptr), SchedClass(nullptr),
         NodeNum(nodenum), NodeQueueId(0), NumPreds(0), NumSuccs(0),
         NumPredsLeft(0), NumSuccsLeft(0), WeakPredsLeft(0), WeakSuccsLeft(0),
         NumRegDefsLeft(0), Latency(0), isVRegCycle(false), isCall(false),
         isCallOp(false), isTwoAddress(false), isCommutable(false),
         hasPhysRegUses(false), hasPhysRegDefs(false), hasPhysRegClobbers(false),
         isPending(false), isAvailable(false), isScheduled(false),
         isScheduleHigh(false), isScheduleLow(false), isCloned(false),
-        isUnbuffered(false), hasReservedResource(false),
+        isUnbuffered(false), hasReservedResource(false), affectsGrouping(false),
         SchedulingPref(Sched::None), isDepthCurrent(false),
         isHeightCurrent(false), Depth(0), Height(0), TopReadyCycle(0),
         BotReadyCycle(0), CopyDstRC(nullptr), CopySrcRC(nullptr) {}

     /// SUnit - Construct an SUnit for post-regalloc scheduling to represent
     /// a MachineInstr.
     SUnit(MachineInstr *instr, unsigned nodenum)
       : Node(nullptr), Instr(instr), OrigNode(nullptr), SchedClass(nullptr),
         NodeNum(nodenum), NodeQueueId(0), NumPreds(0), NumSuccs(0),
         NumPredsLeft(0), NumSuccsLeft(0), WeakPredsLeft(0), WeakSuccsLeft(0),
         NumRegDefsLeft(0), Latency(0), isVRegCycle(false), isCall(false),
         isCallOp(false), isTwoAddress(false), isCommutable(false),
         hasPhysRegUses(false), hasPhysRegDefs(false), hasPhysRegClobbers(false),
         isPending(false), isAvailable(false), isScheduled(false),
         isScheduleHigh(false), isScheduleLow(false), isCloned(false),
-        isUnbuffered(false), hasReservedResource(false),
+        isUnbuffered(false), hasReservedResource(false), affectsGrouping(false),
         SchedulingPref(Sched::None), isDepthCurrent(false),
         isHeightCurrent(false), Depth(0), Height(0), TopReadyCycle(0),
         BotReadyCycle(0), CopyDstRC(nullptr), CopySrcRC(nullptr) {}

     /// SUnit - Construct a placeholder SUnit.
     SUnit()
       : Node(nullptr), Instr(nullptr), OrigNode(nullptr), SchedClass(nullptr),
         NodeNum(BoundaryID), NodeQueueId(0), NumPreds(0), NumSuccs(0),
         NumPredsLeft(0), NumSuccsLeft(0), WeakPredsLeft(0), WeakSuccsLeft(0),
         NumRegDefsLeft(0), Latency(0), isVRegCycle(false), isCall(false),
         isCallOp(false), isTwoAddress(false), isCommutable(false),
         hasPhysRegUses(false), hasPhysRegDefs(false), hasPhysRegClobbers(false),
         isPending(false), isAvailable(false), isScheduled(false),
         isScheduleHigh(false), isScheduleLow(false), isCloned(false),
-        isUnbuffered(false), hasReservedResource(false),
+        isUnbuffered(false), hasReservedResource(false), affectsGrouping(false),
         SchedulingPref(Sched::None), isDepthCurrent(false),
         isHeightCurrent(false), Depth(0), Height(0), TopReadyCycle(0),
         BotReadyCycle(0), CopyDstRC(nullptr), CopySrcRC(nullptr) {}

     /// \brief Boundary nodes are placeholders for the boundary of the
     /// scheduling region.
     ///
     /// BoundaryNodes can have DAG edges, including Data edges, but they do not
[415 lines not shown]

lib/CodeGen/ScheduleDAGInstrs.cpp

[673 lines not shown; context: if (SchedModel.hasInstrSchedModel()) {]
           break;
         case 1:
           SU->isUnbuffered = true;
           break;
         default:
           break;
         }
       }

+      // Set a flag on SU if it must begin or end the decoder group.
+      if (SC->isValid() && (SC->BeginGroup || SC->EndGroup))
+        SU->affectsGrouping = true;
     }
   }
 }

 void ScheduleDAGInstrs::collectVRegUses(SUnit *SU) {
   const MachineInstr *MI = SU->getInstr();
   for (const MachineOperand &MO : MI->operands()) {
     if (!MO.isReg())
[1,012 lines not shown]

lib/Target/SystemZ/CMakeLists.txt

[11 lines not shown]
 add_public_tablegen_target(SystemZCommonTableGen)

 add_llvm_target(SystemZCodeGen
   SystemZAsmPrinter.cpp
   SystemZCallingConv.cpp
   SystemZConstantPoolValue.cpp
   SystemZElimCompare.cpp
   SystemZFrameLowering.cpp
+  SystemZHazardRecognizer.cpp
   SystemZISelDAGToDAG.cpp
   SystemZISelLowering.cpp
   SystemZInstrInfo.cpp
   SystemZLDCleanup.cpp
   SystemZLongBranch.cpp
   SystemZMachineFunctionInfo.cpp
+  SystemZMachineScheduler.cpp
   SystemZMCInstLower.cpp
   SystemZRegisterInfo.cpp
   SystemZSelectionDAGInfo.cpp
   SystemZShortenInst.cpp
   SystemZSubtarget.cpp
   SystemZTargetMachine.cpp
   SystemZTargetTransformInfo.cpp
   SystemZTDC.cpp
   )

 add_subdirectory(AsmParser)
 add_subdirectory(Disassembler)
 add_subdirectory(InstPrinter)
 add_subdirectory(TargetInfo)
 add_subdirectory(MCTargetDesc)

lib/Target/SystemZ/SystemZ.td

 //===-- SystemZ.td - Describe the SystemZ target machine ------ tblgen --===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//

 //===----------------------------------------------------------------------===//
 // Target-independent interfaces which we are implementing
 //===----------------------------------------------------------------------===//

 include "llvm/Target/Target.td"

 //===----------------------------------------------------------------------===//
+// SystemZ subtargets scheduling models.
+//===----------------------------------------------------------------------===//
+include "SystemZSchedule.td"
+
+//===----------------------------------------------------------------------===//
 // SystemZ supported processors and features
 //===----------------------------------------------------------------------===//

 include "SystemZProcessors.td"

 //===----------------------------------------------------------------------===//
 // Register file description
 //===----------------------------------------------------------------------===//
[39 lines not shown]

lib/Target/SystemZ/SystemZHazardRecognizer.h

This file was added.

				//=-- SystemZHazardRecognizer.h - SystemZ Hazard Recognizer ------ C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file declares a hazard recognizer for the SystemZ scheduler.
				//
				// This class is used by the SystemZ scheduling strategy to maintain
				// the state during scheduling, and provide cost functions for
				// scheduling candidates. This includes:
				//
				// * Decoder grouping. A decoder group can maximally hold 3 uops, and
				// instructions that always begin a new group should be scheduled when
				// the current decoder group is empty.
				// * Processor resources usage. It is beneficial to balance the use of
				// resources.
				//
				// ===---------------------------------------------------------------------===//

				#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H
				#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H

				#include "SystemZSubtarget.h"
				#include "llvm/CodeGen/MachineFunction.h"
				#include "llvm/CodeGen/ScheduleHazardRecognizer.h"
				#include "llvm/CodeGen/ScheduleDAG.h"
				#include "llvm/CodeGen/TargetSchedule.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/MC/MCInstrDesc.h"
				#include "llvm/Support/raw_ostream.h"
				#include <string>

				namespace llvm {

				/// SystemZHazardRecognizer maintains the state during scheduling.
				class SystemZHazardRecognizer : public ScheduleHazardRecognizer {

				static TargetSchedModel SchedModel;
				static const SystemZInstrInfo *TII;

				/// Keep track of the number of decoder slots used in the current
				/// decoder group.
				unsigned CurrGroupSize;

				/// The tracking of resources here is quite similar to the common
				/// code use of a critical resource. However, z13 differs in the way
				/// that it has two processor sides which may be interesting to
				/// model in the future (a work in progress).

				/// Counters for the number of uops scheduled per processor
				/// resource.
				SmallVector<int, 0> ProcResourceCounters;

				/// This is the resource with the greatest queue, which the
				/// scheduler tries to avoid.
				unsigned CriticalResourceIdx;

				/// Initialize hazard recognizer before scheduling a region.
				void init();

				/// Return MCSchedClassDesc for SU.
				inline const MCSchedClassDesc *getSchedClassDesc(const SUnit *SU) const {
				return SchedModel.resolveSchedClass(SU->getInstr());
				}

				/// Return MCSchedClassDesc for MI.
				inline const MCSchedClassDesc *getSchedClassDesc(const MachineInstr *MI) const {
				return SchedModel.resolveSchedClass(MI);
				}

				/// Return the number of decoder slots MI requires.
				inline unsigned getNumDecoderSlots(const MachineInstr *MI) const;

				/// Return true if MI fits into current decoder group.
				bool fitsIntoCurrentGroup(MachineInstr *MI) const;

				/// Two decoder groups per cycle are formed (for z13), meaning 2x3
				/// instructions. This function returns a number between 0 and 5,
				/// representing the current decoder slot of the current cycle.
				unsigned getCurrCycleIdx();

				/// LastFPdOpCycleIdx stores the number returned by getCurrCycleIdx()
				/// when a stalling operation is scheduled (which uses the FPd resource).
				unsigned LastFPdOpCycleIdx;

				/// A counter of decoder groups scheduled.
				unsigned GrpCount;

				unsigned getCurrGroupSize() {return CurrGroupSize;};

				/// Start next decoder group.
				void nextGroup(bool DbgOutput = true);

				/// Clear all counters for processor resources.
				void clearProcResCounters();

				/// With the goal of alternating processor sides for stalling (FPd)
				/// ops, return true if it seems good to schedule an FPd op next.
				bool isFPdOpPreferred_distance(const SUnit *SU);

				/// Update the scheduler state by processing MI.
				void EmitInstruction(MachineInstr *MI, SUnit *SU = nullptr);

				public:
				SystemZHazardRecognizer();

				/// Initializes static data members, such as the SchedModel.
				static void initStatic(MachineFunction *MF);

				uweigand: Seems you forgot to update the comment when changing the code :-)
				HazardType getHazardType(SUnit *m, int Stalls = 0) override;
				void Reset() override;
				void EmitInstruction(SUnit *SU) override {
				EmitInstruction(SU->getInstr(), SU);
				}

				// Cost functions used by SystemZPostRASchedStrategy while
				// evaluating candidates.
				uweigand: Why does groupingCost use the return value while resourcesCost uses an output parameter?
				jonpa: During experiments I have also returned a cost here for the 'other' processor side. I guess I could change that back now until it's needed again...?

				/// Return the cost of decoder grouping for SU. If SU must start a
				/// new decoder group, this is negative if this fits the schedule or
				/// positive if it would mean ending a group prematurely. For normal
				/// instructions this returns 0.
				int groupingCost(const SUnit *SU) const;

				/// Set Cost to a positive value if it would be better to wait with
				/// SU, in regards to processor resources usage. A negative value
				/// means it would be good to schedule SU next.
				void resourcesCost(const SUnit *SU, int &Cost);

				#ifndef NDEBUG
				// Debug dumping.
				std::string CurGroupDbg; // current group as text
				void dumpSU(SUnit *SU, raw_ostream &OS) const;
				void dumpMI(MachineInstr *MI, raw_ostream &OS) const;
				void dumpCurrGroup(std::string Msg = "") const;
				void dumpProcResourceCounters() const;
				#endif
				};

				} // namespace llvm

				#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H */
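
For readers who want to experiment with the decoder grouping rules declared in this header without building LLVM, the following is a minimal standalone sketch. It assumes a simplified SchedClass struct standing in for the BeginGroup, EndGroup and NumMicroOps fields of MCSchedClassDesc; SchedClass, DecoderState and the member names are hypothetical and not part of the patch. Resource counters and debug output are left out.

```cpp
#include <cassert>

// Hypothetical stand-in for the few MCSchedClassDesc fields used here.
struct SchedClass {
  bool BeginGroup;
  bool EndGroup;
  unsigned NumMicroOps;
};

// Same slot rules as getNumDecoderSlots(): a cracked instruction takes two
// slots, a group-alone instruction takes the whole group of three.
unsigned numDecoderSlots(const SchedClass &SC) {
  if (SC.BeginGroup)
    return SC.EndGroup ? 3 : 2;
  return SC.NumMicroOps == 0 ? 0 : 1;
}

struct DecoderState {
  unsigned CurrGroupSize = 0; // slots used in the current decoder group
  unsigned GrpCount = 0;      // number of completed decoder groups

  // A group-beginning instruction only fits into an empty group.
  bool fits(const SchedClass &SC) const {
    return !SC.BeginGroup || CurrGroupSize == 0;
  }

  // Negative: SU fits the group naturally. Positive: number of decoder slots
  // that would be wasted by ending the group early. Zero: no preference.
  int groupingCost(const SchedClass &SC) const {
    if (SC.BeginGroup)
      return CurrGroupSize ? int(3 - CurrGroupSize) : -1;
    if (SC.EndGroup) {
      unsigned Resulting = CurrGroupSize + numDecoderSlots(SC);
      return Resulting < 3 ? int(3 - Resulting) : -1;
    }
    return 0;
  }

  void nextGroup() {
    if (CurrGroupSize > 0) {
      ++GrpCount;
      CurrGroupSize = 0;
    }
  }

  // Mirrors the grouping part of EmitInstruction().
  void emit(const SchedClass &SC) {
    if (!fits(SC))
      nextGroup();
    CurrGroupSize += numDecoderSlots(SC);
    assert(CurrGroupSize <= 3 && "decoder group overflow");
    if (CurrGroupSize == 3 || SC.EndGroup)
      nextGroup();
  }

  // Slot index 0..5 within the current cycle (two groups of three per cycle).
  unsigned currCycleIdx() const {
    return CurrGroupSize + (GrpCount % 2 ? 3 : 0);
  }
};
```

In this model, emitting one normal instruction and then asking for the cost of a cracked one yields 2, because the cracked instruction would force the group to close with two slots unused.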

lib/Target/SystemZ/SystemZHazardRecognizer.cpp

This file was added.

				//=-- SystemZHazardRecognizer.cpp - SystemZ Hazard Recognizer ---- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines a hazard recognizer for the SystemZ scheduler.
				//
				// This class is used by the SystemZ scheduling strategy to maintain
				// the state during scheduling, and provide cost functions for
				// scheduling candidates. This includes:
				//
				// * Decoder grouping. A decoder group can maximally hold 3 uops, and
				// instructions that always begin a new group should be scheduled when
				// the current decoder group is empty.
				// * Processor resources usage. It is beneficial to balance the use of
				// resources.
				//
				// ===---------------------------------------------------------------------===//

				#include "SystemZHazardRecognizer.h"
				#include "llvm/ADT/Statistic.h"

				using namespace llvm;

				#define DEBUG_TYPE "misched"

				// This is the limit of processor resource usage at which the
				// scheduler should try to look for other instructions (not using the
				// critical resource).
				cl::opt<int> ProcResCostLim("procres-cost-lim", cl::Hidden,
				cl::desc("The OOO window for processor "
				"resources during scheduling."),
				cl::init(8));

				TargetSchedModel SystemZHazardRecognizer::SchedModel;
				const SystemZInstrInfo *SystemZHazardRecognizer::TII;

				void SystemZHazardRecognizer::initStatic(MachineFunction *MF) {
				const SystemZSubtarget &ST =
				static_cast<const SystemZSubtarget&>(MF->getSubtarget());
				SchedModel.init(ST.getSchedModel(), &ST, ST.getInstrInfo());
				uweigand: I guess instead of this we could simply use getInstrInfo on the SchedModel.
				TII = static_cast<const SystemZInstrInfo *>(ST.getInstrInfo());
				}

				uweigand: This looks OK to me. However, with this change we should now give a scheduling class to the DirectiveInsn pseudo instruction. These do emit some instruction, we just don't know which one, so it should probably be modeled as some "generic" instruction.
				jonpa: I used an empty InstRW construct, which seems to do the job.
				SystemZHazardRecognizer::
				SystemZHazardRecognizer() {
				MaxLookAhead = 1; // Set to 1 to indicate 'enabled'.
				init();
				}

				unsigned SystemZHazardRecognizer::
				getNumDecoderSlots(const MachineInstr *MI) const {
				const MCSchedClassDesc *SC = getSchedClassDesc(MI);
				if (!SC->isValid())
				return 1;

				if (SC->BeginGroup) {
				if (!SC->EndGroup)
				return 2; // Cracked instruction
				else
				return 3; // Expanded/group-alone instruction
				}

				if (SC->NumMicroOps == 0)
				return 0; // e.g. IMPLICIT_DEF -- will not make impact in output.

				return 1; // Normal instruction
				}

				unsigned SystemZHazardRecognizer::getCurrCycleIdx() {
				unsigned Idx = CurrGroupSize;
				if (GrpCount % 2)
				Idx += 3;
				return Idx;
				}

				ScheduleHazardRecognizer::HazardType SystemZHazardRecognizer::
				getHazardType(SUnit *m, int Stalls) {
				return (fitsIntoCurrentGroup(m->getInstr()) ? NoHazard : Hazard);
				}

				uweigand: Is there a reason why this is a separate function and not just done directly in ::Reset()?
				void SystemZHazardRecognizer::Reset() {
				init();
				}

				void SystemZHazardRecognizer::init() {
				CurrGroupSize = 0;
				clearProcResCounters();
				LastFPdOpCycleIdx = UINT_MAX;
				DEBUG(CurGroupDbg = "";);
				}

				bool
				SystemZHazardRecognizer::fitsIntoCurrentGroup(MachineInstr *MI) const {
				const MCSchedClassDesc *SC = getSchedClassDesc(MI);
				if (!SC->isValid())
				return true;

				// A cracked instruction only fits into schedule if the current
				// group is empty.
				if (SC->BeginGroup)
				return (CurrGroupSize == 0);

				// Since a full group is handled immediately in EmitInstruction(),
				// SU should fit into current group. NumSlots should be 1 or 0,
				// since it is not a cracked or expanded instruction.
				assert ((getNumDecoderSlots(MI) <= 1) && (CurrGroupSize < 3) &&
				"Expected normal instruction to fit in non-full group!");

				return true;
				}

				void SystemZHazardRecognizer::nextGroup(bool DbgOutput) {
				if (CurrGroupSize > 0) {
				DEBUG(dumpCurrGroup("Completed decode group"));
				DEBUG(CurGroupDbg = "";);

				GrpCount++;

				// Reset counter for next group.
				CurrGroupSize = 0;

				// Decrease counters for execution units by one.
				for (unsigned i = 0; i < SchedModel.getNumProcResourceKinds(); ++i)
				if (ProcResourceCounters[i] > 0)
				ProcResourceCounters[i]--;

				// Clear CriticalResourceIdx if it is now below the threshold.
				if (CriticalResourceIdx != UINT_MAX &&
				(ProcResourceCounters[CriticalResourceIdx] <=
				ProcResCostLim))
				CriticalResourceIdx = UINT_MAX;
				}

				DEBUG(if (DbgOutput)
				dumpProcResourceCounters(););
				}

				#ifndef NDEBUG // Debug output
				void SystemZHazardRecognizer::dumpSU(SUnit *SU, raw_ostream &OS) const{
				OS << "SU(" << SU->NodeNum << "):";
				dumpMI(SU->getInstr(), OS);
				if (SU->hasReservedResource)
				OS << "/ReservedRes";
				}

				void SystemZHazardRecognizer::dumpMI(MachineInstr *MI, raw_ostream &OS) const{
				OS << TII->getName(MI->getOpcode());
				const MCSchedClassDesc *SC = getSchedClassDesc(MI);
				if (!SC->isValid())
				return;
				uweigand: This looks a bit ad-hoc ... isn't there a more generic way to find the shorter name?
				jonpa: Fixed by using string::substr()/resize() instead. Now all units should really be named per a Z13_XXXUnit pattern.

				for (TargetSchedModel::ProcResIter
				PI = SchedModel.getWriteProcResBegin(SC),
				PE = SchedModel.getWriteProcResEnd(SC); PI != PE; ++PI) {
				const MCProcResourceDesc &PRD =
				*SchedModel.getProcResource(PI->ProcResourceIdx);
				std::string U(PRD.Name);

				// trim e.g. Z13_FXaUnit -> FXa
				if (U.find("FXa") != std::string::npos)
				OS << "/FXa";
				if (U.find("FXb") != std::string::npos)
				OS << "/FXb";
				if (U.find("FPU") != std::string::npos)
				OS << "/FPU";
				if (U.find("FXU") != std::string::npos)
				OS << "/FXU";
				if (U.find("LSU") != std::string::npos)
				OS << "/LSU";
				if (U.find("VBU") != std::string::npos)
				OS << "/VBU";
				if (U.find("VecDFx") != std::string::npos)
				OS << "/VecDFx";
				if (U.find("VecXsPm") != std::string::npos)
				OS << "/VecXsPm";
				if (U.find("VecStr") != std::string::npos)
				OS << "/VecStr";
				if (U.find("VecMul") != std::string::npos)
				OS << "/VecMul";
				if (U.find("VecBF") != std::string::npos)
				OS << "/VecBF";
				if (U.find("VecDF") != std::string::npos)
				OS << "/VecDF";
				if (U.find("VecFPd") != std::string::npos)
				OS << "/VecFPd";

				if (PI->Cycles > 1)
				OS << "(" << PI->Cycles << "cyc)";
				}

				if (SC->NumMicroOps > 1)
				OS << "/" << SC->NumMicroOps << "uops";
				if (SC->BeginGroup && SC->EndGroup)
				OS << "/GroupsAlone";
				else if (SC->BeginGroup)
				OS << "/BeginsGroup";
				else if (SC->EndGroup)
				OS << "/EndsGroup";
				}

				void SystemZHazardRecognizer::dumpCurrGroup(std::string Msg) const {
				dbgs() << "+++ " << Msg;
				dbgs() << ": ";

				if (CurGroupDbg.empty())
				dbgs() << " <empty>\n";
				else {
				dbgs() << "{ " << CurGroupDbg << " }";
				dbgs() << " (" << CurrGroupSize << " decoder slot"
				<< (CurrGroupSize > 1 ? "s":"")
				<< ")\n";
				}
				}

				void SystemZHazardRecognizer::dumpProcResourceCounters() const {
				bool any = false;

				for (unsigned i = 0; i < SchedModel.getNumProcResourceKinds(); ++i)
				if (ProcResourceCounters[i] > 0) {
				any = true;
				break;
				}

				if (!any)
				return;

				dbgs() << "+++ Resource counters:\n";
				for (unsigned i = 0; i < SchedModel.getNumProcResourceKinds(); ++i)
				if (ProcResourceCounters[i] > 0) {
				dbgs() << "+++ Extra schedule for execution unit "
				<< SchedModel.getProcResource(i)->Name
				<< ": " << ProcResourceCounters[i] << "\n";
				any = true;
				}
				}
				#endif //NDEBUG

				void SystemZHazardRecognizer::clearProcResCounters() {
				ProcResourceCounters.assign(SchedModel.getNumProcResourceKinds(), 0);
				CriticalResourceIdx = UINT_MAX;
				}

				// Update state with MI as next instruction. If SU is null, this
				// is during the advancing past instructions not subject to scheduling.
				void SystemZHazardRecognizer::
				EmitInstruction(MachineInstr *MI, SUnit *SU) {
				assert (SU == nullptr || MI == SU->getInstr());
				const MCSchedClassDesc *SC = getSchedClassDesc(MI);
				DEBUG( dumpCurrGroup("Decode group before emission"););

				// If scheduling an MI that must begin a new decoder group, move on
				// to next group.
				if (!fitsIntoCurrentGroup(MI))
				nextGroup();

				DEBUG( if (SU != nullptr) {
				dbgs() << "+++ HazardRecognizer emitting "; dumpSU(SU, dbgs());
				dbgs() << "\n";
				raw_string_ostream cgd(CurGroupDbg);
				if (CurGroupDbg.length())
				cgd << ", ";
				dumpSU(SU, cgd);
				} else {
				dbgs() << "+++ Advancing past: ";
				dumpMI(MI, dbgs());
				dbgs() << "\n";

				raw_string_ostream cgd(CurGroupDbg);
				if (CurGroupDbg.length())
				cgd << ", ";
				cgd << TII->getName(MI->getOpcode());
				});

				// After returning from a call, we don't know much about the state.
				if (MI->isCall()) {
				DEBUG (dbgs() << "+++ Clearing state after call.\n";);
				clearProcResCounters();
				LastFPdOpCycleIdx = UINT_MAX;
				CurrGroupSize += getNumDecoderSlots(MI);
				assert (CurrGroupSize <= 3);
				nextGroup();
				return;
				}

				// Increase counter for execution unit(s).
				for (TargetSchedModel::ProcResIter
				PI = SchedModel.getWriteProcResBegin(SC),
				PE = SchedModel.getWriteProcResEnd(SC); PI != PE; ++PI) {
				// Don't handle FPd together with the other resources.
				if (SchedModel.getProcResource(PI->ProcResourceIdx)->BufferSize == 0)
				continue;
				int &CurrCounter =
				ProcResourceCounters[PI->ProcResourceIdx];
				CurrCounter += PI->Cycles;
				// Check if this is now the new critical resource.
				if ((CurrCounter > ProcResCostLim) &&
				(CriticalResourceIdx == UINT_MAX ||
				(PI->ProcResourceIdx != CriticalResourceIdx &&
				CurrCounter >
				ProcResourceCounters[CriticalResourceIdx]))) {
				DEBUG( dbgs() << "+++ New critical resource: "
				<< SchedModel.getProcResource(PI->ProcResourceIdx)->Name
				<< "\n";);
				CriticalResourceIdx = PI->ProcResourceIdx;
				}
				}

				// Make note of an instruction that uses a blocking resource (FPd).
				if ((SU != nullptr && SU->hasReservedResource)) {
				LastFPdOpCycleIdx = getCurrCycleIdx();
				DEBUG (dbgs() << "+++ Last FPd cycle index: "
				<< LastFPdOpCycleIdx << "\n";);
				}

				// Insert MI into current group by increasing number of slots used
				// in current group.
				CurrGroupSize += getNumDecoderSlots(MI);
				assert (CurrGroupSize <= 3);

				// Check if current group is now full/ended. If so, move on to next
				// group to be ready to evaluate more candidates.
				if (CurrGroupSize == 3 || SC->EndGroup)
				nextGroup();
				}

				int SystemZHazardRecognizer::groupingCost(const SUnit *SU) const {
				const MCSchedClassDesc *SC = getSchedClassDesc(SU);
				if (!SC->isValid())
				return 0;

				// If SU begins new group, it can either break a current group early
				// or fit naturally if current group is empty (negative cost).
				if (SC->BeginGroup) {
				if (CurrGroupSize)
				return 3 - CurrGroupSize;
				return -1;
				}

				// Similarly, a group-ending SU may either fit well (last in group), or
				// end the group prematurely.
				if (SC->EndGroup) {
				unsigned resultingGroupSize =
				(CurrGroupSize + getNumDecoderSlots(SU->getInstr()));
				if (resultingGroupSize < 3)
				return (3 - resultingGroupSize);
				return -1;
				}

				// Most instructions can be placed in any decoder slot.
				return 0;
				}

				bool SystemZHazardRecognizer::isFPdOpPreferred_distance(const SUnit *SU) {
				assert (SU->hasReservedResource);
				// If this is the first FPd op, it should be scheduled high.
				if (LastFPdOpCycleIdx == UINT_MAX)
				return true;
				// If this is not the first FPd op, it should go into the other side
				// of the processor to use the other FPd unit there. This should
				// generally happen if two FPd ops are placed with 2 other
				// instructions between them (modulo 6).
				if (LastFPdOpCycleIdx > getCurrCycleIdx())
				return ((LastFPdOpCycleIdx - getCurrCycleIdx()) == 3);
				return ((getCurrCycleIdx() - LastFPdOpCycleIdx) == 3);
				}

				void SystemZHazardRecognizer::
				resourcesCost(const SUnit *SU, int &Cost) {
				Cost = 0;

				const MachineInstr *MI = SU->getInstr();
				const MCSchedClassDesc *SC = getSchedClassDesc(MI);
				if (!SC->isValid())
				return;

				// For a FPd op, either return min or max value as indicated by the
				// distance to any prior FPd op.
				if (SU->hasReservedResource)
				Cost = (isFPdOpPreferred_distance(SU) ? INT_MIN : INT_MAX);
				// For other instructions, give a cost to the use of the critical resource.
				else if (CriticalResourceIdx != UINT_MAX) {
				for (TargetSchedModel::ProcResIter
				PI = SchedModel.getWriteProcResBegin(SC),
				PE = SchedModel.getWriteProcResEnd(SC); PI != PE; ++PI)
				if (PI->ProcResourceIdx == CriticalResourceIdx)
				Cost = PI->Cycles;
				}
				}
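
The resource bookkeeping above (counters that grow on emission, decay by one per completed decoder group, and nominate a "critical" resource once they pass the procres-cost-lim threshold) can be sketched independently of LLVM. This is a simplified model, not the code under review: ResourceTracker, Uses and fpdOpPreferred are hypothetical names, resource uses are plain (index, cycles) pairs, and the special case that skips unbuffered (FPd) resources in EmitInstruction() is left out.

```cpp
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// Hypothetical: (processor resource index, cycles) pairs for one instruction.
using Uses = std::vector<std::pair<unsigned, int>>;

struct ResourceTracker {
  std::vector<int> Counters;    // pending cycles per processor resource
  unsigned Critical = UINT_MAX; // UINT_MAX means "no critical resource"
  int Limit;                    // plays the role of ProcResCostLim

  ResourceTracker(unsigned NumResources, int Lim)
      : Counters(NumResources, 0), Limit(Lim) {}

  // Account for an emitted instruction and update the critical resource.
  void emit(const Uses &U) {
    for (const auto &Use : U) {
      assert(Use.first < Counters.size() && "resource index out of range");
      int &C = Counters[Use.first];
      C += Use.second;
      if (C > Limit &&
          (Critical == UINT_MAX ||
           (Use.first != Critical && C > Counters[Critical])))
        Critical = Use.first;
    }
  }

  // A decoder group completed: every counter decays by one, and the critical
  // resource is cleared once it is back at or below the limit.
  void nextGroup() {
    for (int &C : Counters)
      if (C > 0)
        --C;
    if (Critical != UINT_MAX && Counters[Critical] <= Limit)
      Critical = UINT_MAX;
  }

  // Positive cost if the candidate would add load to the critical resource.
  int cost(const Uses &U) const {
    int Cost = 0;
    if (Critical == UINT_MAX)
      return Cost;
    for (const auto &Use : U)
      if (Use.first == Critical)
        Cost = Use.second;
    return Cost;
  }
};

// Same test as isFPdOpPreferred_distance(): the first FPd op is always
// preferred; later ones only when exactly 3 decoder slots away from the last.
bool fpdOpPreferred(unsigned LastIdx, unsigned CurrIdx) {
  if (LastIdx == UINT_MAX)
    return true;
  unsigned Dist = LastIdx > CurrIdx ? LastIdx - CurrIdx : CurrIdx - LastIdx;
  return Dist == 3;
}
```

With a limit of 8, two emissions of 5 cycles each on the same resource make it critical, and it stops being critical two decoder groups later, when its counter has decayed back to 8.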

lib/Target/SystemZ/SystemZInstrInfo.h

Show First 20 Lines • Show All 162 Lines • ▼ Show 20 Lines	public:
bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,		bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,
MachineBasicBlock *&FBB,		MachineBasicBlock *&FBB,
SmallVectorImpl<MachineOperand> &Cond,		SmallVectorImpl<MachineOperand> &Cond,
bool AllowModify) const override;		bool AllowModify) const override;
unsigned removeBranch(MachineBasicBlock &MBB,		unsigned removeBranch(MachineBasicBlock &MBB,
int *BytesRemoved = nullptr) const override;		int *BytesRemoved = nullptr) const override;
unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,		unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,		MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
const DebugLoc &DL,		const DebugLoc &DL,
int *BytesAdded = nullptr) const override;		int *BytesAdded = nullptr) const override;
bool analyzeCompare(const MachineInstr &MI, unsigned &SrcReg,		bool analyzeCompare(const MachineInstr &MI, unsigned &SrcReg,
		atrick: FYI, the "new" machine model is meant to be flexible enough that you don't need to create your own hazard recognizer (you can add predicates and arbitrary pseudo machine resources). However, it's tricky to do that and fine just to use a hazard recognizer when you have complicated decode/issue group constraints.
unsigned &SrcReg2, int &Mask, int &Value) const override;		unsigned &SrcReg2, int &Mask, int &Value) const override;
bool optimizeCompareInstr(MachineInstr &CmpInstr, unsigned SrcReg,		bool optimizeCompareInstr(MachineInstr &CmpInstr, unsigned SrcReg,
unsigned SrcReg2, int Mask, int Value,		unsigned SrcReg2, int Mask, int Value,
const MachineRegisterInfo *MRI) const override;		const MachineRegisterInfo *MRI) const override;
bool isPredicable(MachineInstr &MI) const override;		bool isPredicable(MachineInstr &MI) const override;
bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,		bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,
unsigned ExtraPredCycles,		unsigned ExtraPredCycles,
BranchProbability Probability) const override;		BranchProbability Probability) const override;
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	unsigned getFusedCompare(unsigned Opcode,
SystemZII::FusedCompareType Type,		SystemZII::FusedCompareType Type,
const MachineInstr *MI = nullptr) const;		const MachineInstr *MI = nullptr) const;

// Emit code before MBBI in MI to move immediate value Value into		// Emit code before MBBI in MI to move immediate value Value into
// physical register Reg.		// physical register Reg.
void loadImmediate(MachineBasicBlock &MBB,		void loadImmediate(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
unsigned Reg, uint64_t Value) const;		unsigned Reg, uint64_t Value) const;

		// Sometimes, it is possible for the target to tell, even without
		// aliasing information, that two MIs access different memory
		// addresses. This function returns true if two MIs access different
		// memory addresses and false otherwise.
		bool
		areMemAccessesTriviallyDisjoint(MachineInstr &MIa, MachineInstr &MIb,
		AliasAnalysis *AA = nullptr) const override;
};		};
} // end namespace llvm		} // end namespace llvm

#endif		#endif

lib/Target/SystemZ/SystemZInstrInfo.cpp

Show First 20 Lines • Show All 1,510 Lines • ▼ Show 20 Lines	else if (SystemZ::isImmLH(Value)) {
Opcode = SystemZ::LLILH;		Opcode = SystemZ::LLILH;
Value >>= 16;		Value >>= 16;
} else {		} else {
assert(isInt<32>(Value) && "Huge values not handled yet");		assert(isInt<32>(Value) && "Huge values not handled yet");
Opcode = SystemZ::LGFI;		Opcode = SystemZ::LGFI;
}		}
BuildMI(MBB, MBBI, DL, get(Opcode), Reg).addImm(Value);		BuildMI(MBB, MBBI, DL, get(Opcode), Reg).addImm(Value);
}		}

		bool SystemZInstrInfo::
		areMemAccessesTriviallyDisjoint(MachineInstr &MIa, MachineInstr &MIb,
		AliasAnalysis *AA) const {

		if (!MIa.hasOneMemOperand() || !MIb.hasOneMemOperand())
		return false;

		// If mem-operands show that the same address Value is used by both
		// instructions, check for non-overlapping offsets and widths. Not
		// sure if a register based analysis would be an improvement...

		MachineMemOperand *MMOa = *MIa.memoperands_begin();
		MachineMemOperand *MMOb = *MIb.memoperands_begin();
		const Value *VALa = MMOa->getValue();
		const Value *VALb = MMOb->getValue();
		bool SameVal = (VALa && VALb && (VALa == VALb));
		if (!SameVal) {
		const PseudoSourceValue *PSVa = MMOa->getPseudoValue();
		const PseudoSourceValue *PSVb = MMOb->getPseudoValue();
		if (PSVa && PSVb && (PSVa == PSVb))
		SameVal = true;
		}
		if (SameVal) {
		int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
		int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
		int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
		int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
		int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
		if (LowOffset + LowWidth <= HighOffset)
		return true;
		}

		return false;
		}
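
The interval test at the heart of areMemAccessesTriviallyDisjoint() can be tried in isolation. The sketch below assumes a hypothetical MemAccess struct in place of MachineMemOperand, with an opaque base pointer standing in for the IR Value or PseudoSourceValue. It uses 64-bit offsets and widths, which is a choice of this sketch rather than something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the parts of MachineMemOperand used by the check.
struct MemAccess {
  const void *Base; // identity of the underlying object; null if unknown
  int64_t Offset;   // byte offset from Base
  int64_t Width;    // access size in bytes
};

// Returns true only when both accesses are known to use the same base and
// the lower access ends at or before the higher one begins.
bool triviallyDisjoint(const MemAccess &A, const MemAccess &B) {
  if (!A.Base || !B.Base || A.Base != B.Base)
    return false; // different or unknown bases: nothing can be concluded
  const MemAccess &Low = A.Offset < B.Offset ? A : B;
  const MemAccess &High = A.Offset < B.Offset ? B : A;
  return Low.Offset + Low.Width <= High.Offset;
}
```

As in the patch, a false result only means "could not prove disjoint"; two accesses with different bases may still be disjoint, but that needs real alias analysis.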

lib/Target/SystemZ/SystemZMachineScheduler.h

This file was added.

				//==-- SystemZMachineScheduler.h - SystemZ Scheduler Interface -- C++ ----==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// SystemZ MachineScheduler strategy.
				//
				//===----------------------------------------------------------------------===//

				#include "SystemZInstrInfo.h"
				#include "SystemZHazardRecognizer.h"
				#include "llvm/CodeGen/MachineScheduler.h"
				#include "llvm/Support/Debug.h"

				#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H
				#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H

				using namespace llvm;

				namespace llvm {

				/// A MachineSchedStrategy implementation for SystemZ post RA scheduling.
				class SystemZPostRASchedStrategy : public MachineSchedStrategy {

				/// A candidate during instruction evaluation.
				struct Candidate {
				SUnit *SU;

				/// The decoding cost.
				int GroupingCost;

				/// The processor resources cost.
				int ResourcesCost;

				Candidate() : SU(nullptr), GroupingCost(0), ResourcesCost(0) {}
				Candidate(SUnit *SU_, SystemZHazardRecognizer &HazardRec);

				// Compare two candidates.
				bool operator<(const Candidate &other);

				// Check if this node is free of cost ("as good as any").
				bool inline noCost() {
				return (GroupingCost <= 0 && !ResourcesCost);
				}
				};

				// A sorter that makes sure that SUs are considered in the best
				// order.
				struct SUSorter {
				bool operator() (const SUnit *lhs, const SUnit *rhs) const {
				if (lhs->affectsGrouping && !rhs->affectsGrouping)
				return true;
				if (!lhs->affectsGrouping && rhs->affectsGrouping)
				return false;
				uweigand: As discussed offline, we really need to get rid of the global/static variable here. I note that there is quite a bit of similarity between this "preliminary" sorter and the final sorter in Candidate. Maybe we should actually store "preliminary" candidates in the Available list (with GroupingCost and ResourceCost only set to 0 or 1 depending on whether grouping or reserved resources are involved), and then update the cost parameters with actual values once we know them? We then might even be able to reuse the same comparison routine ...
				jonpa: I gave this a try, but I thought it was a bit messy to flip the sorting variables back and forth inside a set of Candidates. I instead use the isScheduleHigh flag for groupers / FPd ops, which simplifies the SUSorter method while also eliminating the static variable. The iteration in pickNode() should be nearly unaffected, since these nodes are quite rare.
				uweigand: OK, that makes sense.

				if (lhs->hasReservedResource && !rhs->hasReservedResource)
				return true;
				if (!lhs->hasReservedResource && rhs->hasReservedResource)
				return false;

				if (lhs->getHeight() > rhs->getHeight())
				return true;
				else if (lhs->getHeight() < rhs->getHeight())
				return false;

				return (lhs->NodeNum < rhs->NodeNum);
				}
				};
				// A set of SUs with a sorter and dump method.
				struct SUSet : std::set<SUnit*, SUSorter> {
				#ifndef NDEBUG
				void dump(SystemZHazardRecognizer &HazardRec);
				#endif
				};

				/// The set of available SUs to schedule next.
				SUSet Available;

				// HazardRecognizer that tracks the scheduler state for the current
				// region.
				SystemZHazardRecognizer HazardRec;

				public:
				SystemZPostRASchedStrategy(const MachineSchedContext *C);

				/// PostRA scheduling does not track pressure.
				bool shouldTrackPressure() const override { return false; }

				/// Initialize the strategy after building the DAG for a new region.
				void initialize(ScheduleDAGMI *dag) override;

				/// Pick the next node to schedule, or return NULL.
				SUnit *pickNode(bool &IsTopNode) override;

				/// ScheduleDAGMI has scheduled an instruction - tell HazardRec
				/// about it.
				void schedNode(SUnit *SU, bool IsTopNode) override;

				/// SU has had all predecessor dependencies resolved. Put it into
				/// Available.
				void releaseTopNode(SUnit *SU) override;

				/// Currently only scheduling top-down, so this method is empty.
				void releaseBottomNode(SUnit *SU) override {};
				};

				} // namespace llvm

				#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H */
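
The ordering that SUSorter imposes on the Available set can be illustrated with a small standalone comparator. This sketch assumes a hypothetical Node struct carrying the four SUnit properties the sorter looks at; Node, NodeSorter, NodeSet and pickOrder are illustrative names only. The comparator is a strict weak ordering because ties are finally broken by the unique NodeNum, which is what std::set requires.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the SUnit fields consulted by SUSorter.
struct Node {
  unsigned NodeNum;
  unsigned Height;
  bool AffectsGrouping;
  bool HasReservedResource;
};

// Groupers first, then reserved-resource (FPd) users, then greater height,
// then lower node number.
struct NodeSorter {
  bool operator()(const Node *L, const Node *R) const {
    if (L->AffectsGrouping != R->AffectsGrouping)
      return L->AffectsGrouping;
    if (L->HasReservedResource != R->HasReservedResource)
      return L->HasReservedResource;
    if (L->Height != R->Height)
      return L->Height > R->Height;
    return L->NodeNum < R->NodeNum;
  }
};

using NodeSet = std::set<const Node *, NodeSorter>;

// Returns the node numbers in the order a pickNode()-style loop visits them.
std::vector<unsigned> pickOrder(const NodeSet &Available) {
  std::vector<unsigned> Order;
  for (const Node *N : Available)
    Order.push_back(N->NodeNum);
  return Order;
}
```

Because the rare grouping and FPd nodes sort to the front, a loop that stops at the first cost-free candidate, as pickNode() does with noCost(), has already looked at them by the time it breaks out.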

lib/Target/SystemZ/SystemZMachineScheduler.cpp

This file was added.

				//==-- SystemZMachineScheduler.cpp - SystemZ Scheduler Interface -- C++ --==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// -------------------------- Post RA scheduling ---------------------------- //
				// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
				// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
				// implementation that looks to optimize decoder grouping and balance the
				// usage of processor resources.
				//===----------------------------------------------------------------------===//

				#include "SystemZMachineScheduler.h"

				using namespace llvm;

				#define DEBUG_TYPE "misched"

				#ifndef NDEBUG
				// Print the set of SUs
				void SystemZPostRASchedStrategy::SUSet::
				dump(SystemZHazardRecognizer &HazardRec) {
				dbgs() << "{";
				for (auto &SU : *this) {
				HazardRec.dumpSU(SU, dbgs());
				if (SU != *rbegin())
				dbgs() << ", ";
				}
				dbgs() << "}\n";
				}
				#endif

				SystemZPostRASchedStrategy::
				SystemZPostRASchedStrategy(const MachineSchedContext *C) {
				SystemZHazardRecognizer::initStatic(C->MF);
				}

				void SystemZPostRASchedStrategy::initialize(ScheduleDAGMI *dag) {
				HazardRec.Reset();
				}

				// Pick the next node to schedule.
				SUnit *SystemZPostRASchedStrategy::pickNode(bool &IsTopNode) {
				// Only scheduling top-down.
				IsTopNode = true;

				if (Available.empty())
				return nullptr;

				// If only one choice, return it.
				if (Available.size() == 1) {
				DEBUG (dbgs() << "+++ Only one: ";
				HazardRec.dumpSU(*Available.begin(), dbgs()); dbgs() << "\n";);
				return *Available.begin();
				}

				// All nodes that are possible to schedule are stored in the
				// Available set.
				DEBUG(dbgs() << "+++ Available: "; Available.dump(HazardRec););

				Candidate Best;
				for (auto *SU : Available) {

				// SU is the next candidate to be compared against current Best.
				Candidate c(SU, HazardRec);

				// Remember which SU is the best candidate.
				if (Best.SU == nullptr || c < Best) {
				Best = c;
				DEBUG(dbgs() << "+++ Best sofar: ";
				HazardRec.dumpSU(Best.SU, dbgs());
				if (Best.GroupingCost != 0)
				dbgs() << "\tGrouping cost:" << Best.GroupingCost;
				if (Best.ResourcesCost != 0)
				dbgs() << " Resource cost:" << Best.ResourcesCost;
				dbgs() << " Height:" << Best.SU->getHeight();
				dbgs() << "\n";);
				}

				if (Best.noCost())
				break;
				}

				assert (Best.SU != nullptr);
				return Best.SU;
				}

				SystemZPostRASchedStrategy::Candidate::
				Candidate(SUnit *SU_, SystemZHazardRecognizer &HazardRec) : Candidate() {
				SU = SU_;

				// Check the grouping cost. For a node that must begin / end a
				// group, it is positive if it would do so prematurely, or negative
				// if it would fit naturally into the schedule.
				GroupingCost = HazardRec.groupingCost(SU);

				// Check the resources cost for this SU.
				HazardRec.resourcesCost(SU, ResourcesCost);
				}

				bool SystemZPostRASchedStrategy::Candidate::
				operator<(const Candidate &other) {

				// Check decoder grouping.
				if (GroupingCost < other.GroupingCost)
				return true;
				if (GroupingCost > other.GroupingCost)
				return false;

				// Compare the use of resources.
				if (ResourcesCost < other.ResourcesCost)
				return true;
				if (ResourcesCost > other.ResourcesCost)
				return false;

				// Higher SU is otherwise generally better.
				if (SU->getHeight() > other.SU->getHeight())
				return true;
				if (SU->getHeight() < other.SU->getHeight())
				return false;

				// If all same, fall back to original order.
				if (SU->NodeNum < other.SU->NodeNum)
				return true;

				return false;
				}

				void SystemZPostRASchedStrategy::schedNode(SUnit *SU, bool IsTopNode) {
				DEBUG(dbgs() << "+++ Scheduling SU(" << SU->NodeNum << ")\n";);

				// Remove SU from Available set and update HazardRec.
				Available.erase(SU);
				HazardRec.EmitInstruction(SU);
				}

				void SystemZPostRASchedStrategy::releaseTopNode(SUnit *SU) {
				// Put all released SUs in the Available set.
				Available.insert(SU);
				}
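
The ordering used by Candidate::operator< and the linear scan in pickNode() can be restated as a small standalone sketch. It assumes nothing beyond what the diff shows: lower grouping cost wins, then lower resources cost, then greater height, then lower node number. The names `Cand`, `better` and `pickBest` are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the patch's Candidate; field names mirror the diff.
struct Cand {
  int GroupingCost;
  int ResourcesCost;
  unsigned Height;
  unsigned NodeNum;
};

// Returns true if A should be preferred over B.
bool better(const Cand &A, const Cand &B) {
  // Lower costs win; greater height wins, so the height operands are swapped.
  // NodeNum is the final tie-breaker (original order).
  return std::tie(A.GroupingCost, A.ResourcesCost, B.Height, A.NodeNum) <
         std::tie(B.GroupingCost, B.ResourcesCost, A.Height, B.NodeNum);
}

// Linear scan over the available candidates, in the spirit of pickNode().
// Returns the index of the best candidate, or -1 if the list is empty.
int pickBest(const std::vector<Cand> &Avail) {
  int Best = -1;
  for (int I = 0, E = static_cast<int>(Avail.size()); I != E; ++I)
    if (Best < 0 || better(Avail[I], Avail[Best]))
      Best = I;
  return Best;
}
```

Unlike the patch, this sketch omits the early exit on a cost-free candidate (Best.noCost()), which only shortens the scan.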

lib/Target/SystemZ/SystemZProcessors.td

	 def FeatureVector : SystemZFeature<
	 "vector", "Vector",
	 "Assume that the vectory facility is installed"
	 >;
	 def FeatureNoVector : SystemZMissingFeature<"Vector">;

	 def : Processor<"generic", NoItineraries, []>;
	 def : Processor<"z10", NoItineraries, []>;
	-def : Processor<"z196", NoItineraries,
	+def : ProcessorModel<"z196", Z196Model,
	 [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
	 FeatureFPExtension, FeaturePopulationCount,
	 FeatureFastSerialization, FeatureInterlockedAccess1]>;
	-def : Processor<"zEC12", NoItineraries,
	+def : ProcessorModel<"zEC12", ZEC12Model,
	 [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
	 FeatureFPExtension, FeaturePopulationCount,
	 FeatureFastSerialization, FeatureInterlockedAccess1,
	 FeatureMiscellaneousExtensions,
	 FeatureTransactionalExecution, FeatureProcessorAssist]>;
	-def : Processor<"z13", NoItineraries,
	+def : ProcessorModel<"z13", Z13Model,
	 [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
	 FeatureFPExtension, FeaturePopulationCount,
	 FeatureFastSerialization, FeatureInterlockedAccess1,
	 FeatureMiscellaneousExtensions,
	 FeatureTransactionalExecution, FeatureProcessorAssist,
	 FeatureVector, FeatureLoadStoreOnCond2]>;

lib/Target/SystemZ/SystemZSchedule.td

This file was added.

				//==-- SystemZSchedule.td - SystemZ Scheduling Definitions ----- tblgen --==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				// Scheduler resources

				// These three resources are used to express decoder grouping rules.
				// The number of decoder slots needed by an instruction is normally
				// one. For a cracked instruction (BeginGroup && !EndGroup) it is
				// two. Expanded instructions (BeginGroup && EndGroup) group alone.
				def GroupAlone : SchedWrite;
				def BeginGroup : SchedWrite;
				def EndGroup : SchedWrite;

				// Latencies, to make code a bit neater. If more than one resource is
				// used for an instruction, the greatest latency (not the sum) will be
				// output by Tablegen. Therefore, in such cases one of these resources
				// is needed.
				def Lat2 : SchedWrite;
				def Lat3 : SchedWrite;
				def Lat4 : SchedWrite;
				def Lat5 : SchedWrite;
				def Lat6 : SchedWrite;
				def Lat7 : SchedWrite;
				def Lat8 : SchedWrite;
				def Lat9 : SchedWrite;
				def Lat10 : SchedWrite;
				def Lat11 : SchedWrite;
				def Lat12 : SchedWrite;
				def Lat15 : SchedWrite;
				def Lat20 : SchedWrite;
				def Lat30 : SchedWrite;

				// Fixed-point
				def FXa : SchedWrite;
				def FXb : SchedWrite;
				def FXU : SchedWrite;

				// Load/store unit
				def LSU : SchedWrite;

				// Model a return without latency, otherwise if-converter will model
				// extra cost and abort (currently there is an assert that checks that
				// all instructions have at least one uop).
				def LSU_lat1 : SchedWrite;

				// Floating point unit (zEC12 and earlier)
				def FPU : SchedWrite;

				// Vector sub units (z13)
				def VecBF : SchedWrite;
				def VecDF : SchedWrite;
				def VecFPd : SchedWrite; // Blocking BFP div/sqrt unit.
				def VecMul : SchedWrite;
				def VecStr : SchedWrite;
				def VecXsPm : SchedWrite;

				// Virtual branching unit
				def VBU : SchedWrite;


				include "SystemZScheduleZ13.td"
				include "SystemZScheduleZEC12.td"
				include "SystemZScheduleZ196.td"
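
The latency comment in SystemZSchedule.td explains why the LatN writes exist: the combined latency is the greatest of the listed latencies, not their sum. A minimal sketch of that rule follows, using the z13 values from this diff (FXa latency 1, LSU latency 4, Lat5 with NumMicroOps = 0). `WriteResSketch`, `classLatency` and `classMicroOps` are hypothetical names, and treating micro-ops as additive with a default of one per write is an assumption here rather than something the diff states.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified WriteRes entry: just the two fields used here.
struct WriteResSketch {
  unsigned Latency;
  unsigned NumMicroOps;
};

// Latency of the combined class: the greatest latency, not the sum.
unsigned classLatency(const std::vector<WriteResSketch> &Writes) {
  unsigned Latency = 0;
  for (const WriteResSketch &W : Writes)
    Latency = std::max(Latency, W.Latency);
  return Latency;
}

// Micro-op count of the combined class (assumed here to be additive).
unsigned classMicroOps(const std::vector<WriteResSketch> &Writes) {
  unsigned Ops = 0;
  for (const WriteResSketch &W : Writes)
    Ops += W.NumMicroOps;
  return Ops;
}
```

Under these assumptions, [FXa, LSU] alone would give a latency of 4, while [FXa, LSU, Lat5] gives 5 without adding a micro-op, which matches how entries such as "A(Y)?$" are written in the z13 file.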

lib/Target/SystemZ/SystemZScheduleZ13.td

This file was added.

				//==-- SystemZScheduleZ13.td - SystemZ Scheduling Definitions -- tblgen --==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the machine model for Z13 to support instruction
				// scheduling and other instruction cost heuristics.
				//
				//===----------------------------------------------------------------------===//

				def Z13Model : SchedMachineModel {

				let IssueWidth = 6; // 2 * 3 instructions decoded per cycle.
				let MicroOpBufferSize = 60; // Issue queues
				let LoadLatency = 1; // Optimistic load latency.

				let PostRAScheduler = 1;

				// Extra cycles for a mispredicted branch.
				let MispredictPenalty = 8;

				// This model does not include operand specific information.
				let CompleteModel = 0;
				uweigand: I think this is the default setting, so we can just leave it out here.
				jonpa: aah, right.
				}
				uweigand: We should really try to get this complete, so that instructions added in the future don't accidentally lack scheduling information. How difficult would it be to get there?
				jonpa: I added the hasNoSchedulingInfo flag on the appropriate instructions, and then set CompleteModel (the reason I did not do that before is that, AFAICR, this flag then also demanded modeling of operand writes or something of that sort). It is worth mentioning that this triggers an error during the build for any instruction missing scheduling input for all subtargets. In a debug build, TargetSchedule.cpp will then cause an abort during compilation if the subtarget does not have scheduling input for an emitted instruction.
				uweigand: Great, thanks! I do think we indeed want the error, even for older subtargets. Hopefully in the not-too-distant future we'll have completed the instruction set for the old processors anyway ...
				jonpa: I found that my previous statement on the compile-time checking for scheduling input was not actually correct: it does not really cover all instructions. Currently asserts trigger only in these two cases:
				All subtargets have omitted the instruction from the Schedule .td files. TableGen catches this, and only this, because it isn't clever enough to know whether a given subtarget (with a missing sched class for an instruction) actually supports that instruction.
				An instruction has a def operand and the sched class does not have a WriteLatency entry for it. This is a specialized assert (in computeOperandLatency()) which doesn't cover all instructions. I could, for instance, remove the InstRW for a compare and not get any assert triggering.
				So there really isn't any general assert that checks that, for a subtarget with a CompleteModel, all instructions actually emitted have a sane scheduling class. When I removed the InstRW for an instruction, it got its own (valid) sched class for the subtarget, but with just 0 values. I think we could catch the error of forgetting a subtarget/instruction sched annotation with an assert in the scheduler that demands at least one uop for any target instruction. This has at least worked well during my earlier experiments. Should I add this just in the SystemZ backend? Or could it be part of the common code somewhere? (I am thinking this should be done both pre-ra and post-ra.) Or is there any reason not to demand this that I have missed?
				uweigand: I don't think the backend is the right place for this. In the backend, you only see the instructions that the code being compiled happens to use; and when you do find an error there, there's not much you can do. The right place does seem to be TableGen. In fact, I had interpreted the code in CodeGenSchedModels::checkCompleteness as doing just that check. If this doesn't work as expected, it probably ought to be fixed there. But in any case this is a separate problem and shouldn't hold up this patch.
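
The "at least one uop" check proposed in the thread above could look roughly like the sketch below. This is only an illustration over a hypothetical flattened table; `InstrSchedInfo` and `findUnmodeled` are made-up names and do not correspond to any TableGen or scheduler API. It assumes, as described in the thread, that an instruction without an InstRW for a subtarget ends up with a sched class whose values are all 0.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of one instruction's resolved scheduling
// class for a single subtarget.
struct InstrSchedInfo {
  std::string Opcode;
  unsigned NumMicroOps;     // 0 when no InstRW matched for this subtarget
  bool HasNoSchedulingInfo; // instruction is deliberately left unmodeled
};

// Collect the opcodes that would trip an "at least one uop" check.
std::vector<std::string>
findUnmodeled(const std::vector<InstrSchedInfo> &Instrs) {
  std::vector<std::string> Missing;
  for (const InstrSchedInfo &I : Instrs)
    if (!I.HasNoSchedulingInfo && I.NumMicroOps == 0)
      Missing.push_back(I.Opcode);
  return Missing;
}
```

Running such a check over the full instruction table, rather than over emitted instructions only, is closer to the TableGen-side check that uweigand argues for.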

				let SchedModel = Z13Model in {

				// These definitions could be put in a subtarget common include file,
				// but it seems the include system in Tablegen currently rejects
				// multiple includes of same file.
				def : WriteRes<GroupAlone, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				let EndGroup = 1;
				}
				def : WriteRes<BeginGroup, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				}
				def : WriteRes<EndGroup, []> {
				atrick: In-order scheduling with multiple functional units of the same type is somewhat broken in the generic scheduler. ReservedCycles only tracks a worst-case resource availability across all units. It should really be a two-dimensional array. I know this has been fixed out-of-tree for at least one target but never pushed back. It would be great if you fixed that!
				jonpa: This unit is handled as a special case by the HazardRecognizer, I guess partly because I couldn't see that the code, as you say, did what I wanted. That would be interesting to fix... could this out-of-tree target perhaps push its fix?
				let NumMicroOps = 0;
				atrick: It wasn't a complete/general fix. But yes, I'll encourage anyone I can to improve the in-tree code. Out-of-tree work usually leads to problems. On the other hand, making it easy to write custom, possibly out-of-tree schedulers was a major goal of MI scheduling.
				let EndGroup = 1;
				}
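
atrick's remark about ReservedCycles suggests tracking availability per unit instead of one worst-case value per resource kind. A minimal sketch of such a two-dimensional table is given below. `PerUnitReservations` is a hypothetical class written for illustration; it is not the generic scheduler's data structure, nor the out-of-tree fix mentioned in the comment.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-unit reservation table: NextFree[Kind][Unit] is the first
// cycle at which that unit can accept another instruction.
class PerUnitReservations {
  std::vector<std::vector<unsigned>> NextFree;

public:
  // UnitsPerKind[K] is the number of identical units of resource kind K
  // (each count is assumed to be at least 1).
  explicit PerUnitReservations(const std::vector<unsigned> &UnitsPerKind) {
    for (unsigned N : UnitsPerKind)
      NextFree.emplace_back(N, 0u);
  }

  // Reserve the earliest-free unit of Kind for Busy cycles, no earlier than
  // Cycle. Returns the cycle at which the instruction can start.
  unsigned reserve(unsigned Kind, unsigned Cycle, unsigned Busy) {
    std::vector<unsigned> &Units = NextFree[Kind];
    auto It = std::min_element(Units.begin(), Units.end());
    unsigned Start = std::max(Cycle, *It);
    *It = Start + Busy;
    return Start;
  }
};
```

With two units of one kind, as for Z13_VecFPdUnit in this diff, two 30-cycle operations can both start at cycle 0 and only a third has to wait until cycle 30. A single counter per kind would presumably already delay the second one.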
				def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
				def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
				def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
				def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
				def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
				def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
				def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
				def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
				def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
				def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
				def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
				def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
				def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
				def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}

				// Execution units.
				def Z13_FXaUnit : ProcResource<2>;
				def Z13_FXbUnit : ProcResource<2>;
				def Z13_LSUnit : ProcResource<2>;
				def Z13_VecBFUnit : ProcResource<2>;
				def Z13_VecDFUnit : ProcResource<2>;
				def Z13_VecFPdUnit : ProcResource<2> { let BufferSize = 0; /* blocking */ }
				def Z13_VecMulUnit : ProcResource<2>;
				def Z13_VecStrUnit : ProcResource<2>;
				def Z13_VecXsPmUnit : ProcResource<2>;
				def Z13_VBUnit : ProcResource<1>;

				// Subtarget specific definitions of scheduling resources.
				def : WriteRes<FXa, [Z13_FXaUnit]> { let Latency = 1; }
				def : WriteRes<FXb, [Z13_FXbUnit]> { let Latency = 1; }
				def : WriteRes<LSU, [Z13_LSUnit]> { let Latency = 4; }
				def : WriteRes<VecBF, [Z13_VecBFUnit]> { let Latency = 8; }
				def : WriteRes<VecDF, [Z13_VecDFUnit]>;
				def : WriteRes<VecFPd, [Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
				Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit]>
				{ let Latency = 30; }
				def : WriteRes<VecMul, [Z13_VecMulUnit]> { let Latency = 5; }
				def : WriteRes<VecStr, [Z13_VecStrUnit]> { let Latency = 4; }
				def : WriteRes<VecXsPm, [Z13_VecXsPmUnit]> { let Latency = 3; }
				def : WriteRes<VBU, [Z13_VBUnit]>; // Virtual Branching Unit

				// -------------------------- INSTRUCTIONS ---------------------------------- //

				// InstRW constructs have been used in order to preserve the
				// readability of the InstrInfo files.

				// For each instruction, as matched by a regexp, provide a list of
				// resources that it needs. These will be combined into a SchedClass.

				uweigand: Ideally, the ordering in these files should mostly correspond to the ordering in the original InstrInfo files, just to make them easier to find ...
				jonpa: I have reorganized the files completely to match the InstrInfo file sections.
				uweigand: Excellent, it is much more readable now!
				// Call
				uweigand: Also, it is somewhat annoying that we need to list not just the basic instruction definitions, but all the various aliases as well. I'm wondering if there isn't some way to annotate the Alias definition in the main file with the opcode the alias will be resolved to in the end, so that it can be used for scheduling purposes ...
				jonpa: I could not as of yet find any way to achieve this, but there might be some way...
				uweigand: Oh well, if there's no straightforward way, it's OK the way it is now ...
				def : InstRW<[VBU, FXa, FXa, Lat3, GroupAlone], (instregex "BRAS$")>;
				def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "(Call)?BASR$")>;
				def : InstRW<[FXb], (instregex "CallB(C)?R$")>;
				def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "(Call)?BRASL$")>;
				def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "TLS_(G|L)DCALL$")>;
				def : InstRW<[VBU], (instregex "CallBRCL$")>;
				def : InstRW<[VBU], (instregex "CallJG$")>;

				// Return
				def : InstRW<[FXb, EndGroup], (instregex "Return$")>;
				def : InstRW<[FXb], (instregex "CondReturn$")>;

				// Branch
				def : InstRW<[FXb], (instregex "B(C)?R$")>;
				def : InstRW<[VBU], (instregex "BRC(L)?$")>;
				def : InstRW<[FXa, EndGroup], (instregex "BRCT(G)?$")>;
				def : InstRW<[VBU], (instregex "J(G)?$")>;
				def : InstRW<[FXa, FXa, FXb, FXb, Lat4, GroupAlone], (instregex "BRX(H|LE)$")>;

				// Compare and branch
				def : InstRW<[FXb], (instregex "C(I|R)J$")>;
				def : InstRW<[FXb], (instregex "CG(I|R)J$")>;
				def : InstRW<[FXb], (instregex "CL(I|R)J$")>;
				def : InstRW<[FXb], (instregex "CLG(I|R)J$")>;
				def : InstRW<[FXb], (instregex "CG(R|I)J$")>;
				def : InstRW<[FXb], (instregex "C(R|I)J$")>;
				def : InstRW<[FXb], (instregex "CL(R|I)J$")>;
				def : InstRW<[FXb], (instregex "CLR$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CLIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CLGIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CGIB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CGRB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CLGRB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CLR(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CLRB(Call|Return)?$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "CRB(Call|Return)?$")>;

				// Serialize
				def : InstRW<[FXb, EndGroup], (instregex "Serialize$")>;

				// Trap instructions
				def : InstRW<[VBU], (instregex "(Cond)?Trap$")>;

				///// FIXED POINT

				// Addition
				def : InstRW<[FXa, LSU, Lat5], (instregex "A(Y)?$")>;
				def : InstRW<[FXa], (instregex "AIH$")>;
				def : InstRW<[FXa], (instregex "AFI(Mux)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "AG$")>;
				def : InstRW<[FXa], (instregex "AGFI$")>;
				def : InstRW<[FXa], (instregex "AGHI(K)?$")>;
				def : InstRW<[FXa], (instregex "AGR(K)?$")>;
				def : InstRW<[FXa], (instregex "AHI(K)?$")>;
				def : InstRW<[FXa], (instregex "AHIMux(K)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "AL(Y)?$")>;
				def : InstRW<[FXa], (instregex "AL(FI|HSIK)$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "ALG(F)?$")>;
				def : InstRW<[FXa], (instregex "ALGHSIK$")>;
				def : InstRW<[FXa], (instregex "ALGF(I|R)$")>;
				def : InstRW<[FXa], (instregex "ALGR(K)?$")>;
				def : InstRW<[FXa], (instregex "ALR(K)?$")>;
				def : InstRW<[FXa], (instregex "AR(K)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "A(G)?SI$")>;

				// Logical addition with carry
				def : InstRW<[FXa, LSU, Lat6, GroupAlone], (instregex "ALC(G)?$")>;
				def : InstRW<[FXa, Lat2, GroupAlone], (instregex "ALC(G)?R$")>;

				// Add with sign extension (32 -> 64)
				def : InstRW<[FXa, LSU, Lat6], (instregex "AGF$")>;
				def : InstRW<[FXa, Lat2], (instregex "AGFR$")>;

				// Add halfword
				def : InstRW<[FXa, LSU, Lat6], (instregex "AH(Y)?$")>;

				// Subtraction
				def : InstRW<[FXa, LSU, Lat5], (instregex "S(G|Y)?$")>;
				def : InstRW<[FXa], (instregex "SGR(K)?$")>;
				def : InstRW<[FXa], (instregex "SLFI$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "SL(G|GF|Y)?$")>;
				def : InstRW<[FXa], (instregex "SLGF(I|R)$")>;
				def : InstRW<[FXa], (instregex "SLGR(K)?$")>;
				def : InstRW<[FXa], (instregex "SLR(K)?$")>;
				def : InstRW<[FXa], (instregex "SR(K)?$")>;

				// Subtraction with borrow
				def : InstRW<[FXa, LSU, Lat6, GroupAlone], (instregex "SLB(G)?$")>;
				def : InstRW<[FXa, Lat2, GroupAlone], (instregex "SLB(G)?R$")>;

				// Subtraction with sign extension (32 -> 64)
				def : InstRW<[FXa, LSU, Lat6], (instregex "SGF$")>;
				def : InstRW<[FXa, Lat2], (instregex "SGFR$")>;

				// Subtract halfword
				def : InstRW<[FXa, LSU, Lat6], (instregex "SH(Y)?$")>;

				// Multiply
				def : InstRW<[FXa, LSU, Lat10], (instregex "MS(GF|Y)?$")>;
				def : InstRW<[FXa, Lat6], (instregex "MS(R|FI)$")>;
				def : InstRW<[FXa, LSU, Lat12], (instregex "MSG$")>;
				def : InstRW<[FXa, Lat8], (instregex "MSGR$")>;
				def : InstRW<[FXa, Lat6], (instregex "MSGF(I|R)$")>;
				def : InstRW<[FXa, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
				def : InstRW<[FXa, Lat9, GroupAlone], (instregex "MLGR$")>;
				def : InstRW<[FXa, Lat5], (instregex "MGHI$")>;
				def : InstRW<[FXa, Lat5], (instregex "MHI$")>;
				def : InstRW<[FXa, LSU, Lat9], (instregex "MH(Y)?$")>;

				// Divide
				def : InstRW<[FXa, Lat30, GroupAlone], (instregex "DSG(F)?R$")>;
				def : InstRW<[LSU, FXa, Lat30, GroupAlone], (instregex "DSG(F)?$")>;
				def : InstRW<[FXa, FXa, Lat20, GroupAlone], (instregex "DLR$")>;
				def : InstRW<[FXa, FXa, Lat30, GroupAlone], (instregex "DLGR$")>;
				def : InstRW<[FXa, FXa, LSU, Lat30, GroupAlone], (instregex "DL(G)?$")>;

				// And
				def : InstRW<[FXb, LSU, Lat5], (instregex "NTSTG$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "N(G|Y)?$")>;
				uweigand: This (NTSTG) is not an "And", it's a non-transactional store and should go with the transaction-related instructions.
				def : InstRW<[FXa], (instregex "NGR(K)?$")>;
				def : InstRW<[FXa], (instregex "NI(FMux|HMux|LMux)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "NI(Y)?$")>;
				def : InstRW<[FXa], (instregex "NIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "NIHH(64)?$")>;
				def : InstRW<[FXa], (instregex "NIHL(64)?$")>;
				def : InstRW<[FXa], (instregex "NILF(64)?$")>;
				def : InstRW<[FXa], (instregex "NILH(64)?$")>;
				def : InstRW<[FXa], (instregex "NILL(64)?$")>;
				def : InstRW<[FXa], (instregex "NR(K)?$")>;

				// Or
				def : InstRW<[FXa, LSU, Lat5], (instregex "O(G|Y)?$")>;
				def : InstRW<[FXa], (instregex "OGR(K)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "OI(Y)?$")>;
				def : InstRW<[FXa], (instregex "OI(FMux|HMux|LMux)$")>;
				def : InstRW<[FXa], (instregex "OIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "OIHH(64)?$")>;
				def : InstRW<[FXa], (instregex "OIHL(64)?$")>;
				def : InstRW<[FXa], (instregex "OILF(64)?$")>;
				def : InstRW<[FXa], (instregex "OILH(64)?$")>;
				def : InstRW<[FXa], (instregex "OILL(64)?$")>;
				def : InstRW<[FXa], (instregex "OR(K)?$")>;

				// Xor
				def : InstRW<[FXa, LSU, Lat5], (instregex "X(G|Y)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "XI(Y)?$")>;
				def : InstRW<[FXa], (instregex "XIFMux$")>;
				def : InstRW<[FXa], (instregex "XGR(K)?$")>;
				def : InstRW<[FXa], (instregex "XIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "XILF(64)?$")>;
				def : InstRW<[FXa], (instregex "XR(K)?$")>;

				// Insert
				def : InstRW<[FXa, LSU, Lat5], (instregex "IC(Y)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "IC32(Y)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "ICM(H|Y)?$")>;
				def : InstRW<[FXa], (instregex "II(F|H|L)Mux$")>;
				def : InstRW<[FXa], (instregex "IIHF(64)?$")>;
				def : InstRW<[FXa], (instregex "IIHH(64)?$")>;
				def : InstRW<[FXa], (instregex "IIHL(64)?$")>;
				def : InstRW<[FXa], (instregex "IILF(64)?$")>;
				def : InstRW<[FXa], (instregex "IILH(64)?$")>;
				def : InstRW<[FXa], (instregex "IILL(64)?$")>;

				// And / Or / Xor character
				def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "(N|O|X)C$")>;

				// Shifts
				def : InstRW<[FXa], (instregex "SLL(G|K)?$")>;
				def : InstRW<[FXa], (instregex "SRL(G|K)?$")>;
				def : InstRW<[FXa], (instregex "SRA(G|K)?$")>;
				def : InstRW<[FXa], (instregex "SLA(K)?$")>;

				// Rotate
				def : InstRW<[FXa, LSU, Lat6], (instregex "RLL(G)?$")>;

				// Rotate and insert
				def : InstRW<[FXa], (instregex "RISBG(N|32)?$")>;
				def : InstRW<[FXa], (instregex "RISBH(G|H|L)$")>;
				def : InstRW<[FXa], (instregex "RISBL(G|H|L)$")>;
				def : InstRW<[FXa], (instregex "RISBMux$")>;

				// Rotate and Select
				def : InstRW<[FXa, FXa, Lat3, BeginGroup], (instregex "R(N|O|X)SBG$")>;

				// Extend
				def : InstRW<[FXa], (instregex "AEXT128_64$")>;
				def : InstRW<[FXa], (instregex "ZEXT128_(32|64)$")>;

				// Find leftmost one
				def : InstRW<[FXa, Lat6, GroupAlone], (instregex "FLOGR$")>;

				// Population count
				def : InstRW<[FXa, Lat3], (instregex "POPCNT$")>;

				// Compare
				def : InstRW<[FXb, LSU, Lat5], (instregex "C(G|Y|Mux|RL)?$")>;
				def : InstRW<[FXb], (instregex "CFI(Mux)?$")>;
				def : InstRW<[FXb], (instregex "CG(F|H)I$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CG(HSI|RL)$")>;
				def : InstRW<[FXb], (instregex "C(G)?R$")>;
				def : InstRW<[FXb], (instregex "C(HI|IH)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CH(F|SI)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CL(Y|Mux|FHSI)?$")>;
				def : InstRW<[FXb], (instregex "CLFI(Mux)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLG(HRL|HSI)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLGF(RL)?$")>;
				def : InstRW<[FXb], (instregex "CLGF(I|R)$")>;
				def : InstRW<[FXb], (instregex "CLGR$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLGRL$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLH(F|RL|HSI)$")>;
				def : InstRW<[FXb], (instregex "CLIH$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLI(Y)?$")>;
				def : InstRW<[FXb], (instregex "CLR$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "CLRL$")>;

				// Compare halfword
				def : InstRW<[FXb, LSU, Lat6], (instregex "CH(Y|RL)?$")>;
				def : InstRW<[FXb, LSU, Lat6], (instregex "CGH(RL)?$")>;
				def : InstRW<[FXa, FXb, LSU, Lat6, BeginGroup], (instregex "CHHSI$")>;

				// Compare with sign extension (32 -> 64)
				def : InstRW<[FXb, LSU, Lat6], (instregex "CGF(RL)?$")>;
				def : InstRW<[FXb, Lat2], (instregex "CGFR$")>;

				// Compare and swap
				def : InstRW<[FXa, FXb, LSU, Lat6, GroupAlone], (instregex "CS(G\|Y)?$")>;

				// Compare logical character
				def : InstRW<[FXb, LSU, LSU, Lat9, BeginGroup], (instregex "CLC$")>;

				// Compare and trap
				def : InstRW<[FXb], (instregex "C(G)?IT$")>;
				def : InstRW<[FXb], (instregex "C(G)?RT$")>;
				def : InstRW<[FXb], (instregex "CLG(I|R)T$")>;
				def : InstRW<[FXb], (instregex "CLFIT$")>;
				def : InstRW<[FXb], (instregex "CLRT$")>;

				// Test under mask
				def : InstRW<[FXb, LSU, Lat5], (instregex "TM(Y)?$")>;
				def : InstRW<[FXb], (instregex "TM(H|L)Mux$")>;
				def : InstRW<[FXb], (instregex "TMHH(64)?$")>;
				def : InstRW<[FXb], (instregex "TMHL(64)?$")>;
				def : InstRW<[FXb], (instregex "TMLH(64)?$")>;
				def : InstRW<[FXb], (instregex "TMLL(64)?$")>;

				// Load and test
				def : InstRW<[FXa, LSU, Lat5], (instregex "LT(G|GF)?$")>;
				def : InstRW<[FXa], (instregex "LT(G|GF)?R$")>;

				// Moves
				def : InstRW<[FXb, LSU, Lat5], (instregex "MV(G|H)?HI$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "MVI(Y)?$")>;

				// Move character
				def : InstRW<[FXb, LSU, LSU, LSU, Lat8, GroupAlone], (instregex "MVC$")>;

				// Move with key
				def : InstRW<[FXa, FXa, FXb, LSU, Lat8, GroupAlone], (instregex "MVCK$")>;

				// Pseudo -> reg move
				def : InstRW<[FXa], (instregex "COPY(_TO_REGCLASS)?$")>;
				def : InstRW<[FXa], (instregex "EXTRACT_SUBREG$")>;
				def : InstRW<[FXa], (instregex "INSERT_SUBREG$")>;
				def : InstRW<[FXa], (instregex "REG_SEQUENCE$")>;
				def : InstRW<[FXa], (instregex "SUBREG_TO_REG$")>;

				def : InstRW<[], (instregex "IMPLICIT_DEF$")>;

				// Loads (LSU)
				def : InstRW<[LSU], (instregex "L(Y|FH|RL|Mux|CBB)?$")>;
				def : InstRW<[LSU], (instregex "LD(Y|E32)?$")>;
				def : InstRW<[LSU], (instregex "LG(RL)?$")>;
				def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
				def : InstRW<[LSU], (instregex "LLG(C|F|H|FRL|HRL)$")>;
				def : InstRW<[LSU], (instregex "LLH(RL|Mux)?$")>;
				def : InstRW<[LSU], (instregex "L(X|128)$")>;

				// Loads (FXa)
				def : InstRW<[FXa, LSU, Lat5], (instregex "LL(C\|H)H$")>;
				def : InstRW<[FXa], (instregex "LLCR(Mux)?$")>;
				def : InstRW<[FXa], (instregex "LLG(C\|F\|H)R$")>;
				def : InstRW<[FXa], (instregex "LLHR(Mux)?$")>;
				def : InstRW<[FXa], (instregex "LLIH(F\|H\|L)$")>;
				def : InstRW<[FXa], (instregex "LLIL(F\|H\|L)$")>;
				def : InstRW<[FXa], (instregex "LA(Y\|RL)?$")>;
				def : InstRW<[FXa], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAA(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAAL(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAN(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAO(G)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "LAX(G)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LB(H\|Mux)?$")>;
				def : InstRW<[FXa], (instregex "L(B\|G)R$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LGB$")>;
				def : InstRW<[FXa], (instregex "LGBR$")>;
				def : InstRW<[FXa], (instregex "LG(F\|H)I$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LG(F\|H)$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LG(F\|H)RL$")>;
				def : InstRW<[FXa], (instregex "LG(F\|H)R$")>;
				def : InstRW<[FXa], (instregex "LHI(Mux)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LH(H\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXa], (instregex "LHR$")>;
				def : InstRW<[FXa], (instregex "LR(Mux)?$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "LRV(G\|H)?$")>;
				def : InstRW<[FXa], (instregex "LRV(G)?R$")>;


				// Load GR from FPR
				def : InstRW<[FXb, Lat3], (instregex "LGDR$")>;

				// Load multiple (estimated average of 5 ops)
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "LM(H\|Y\|G)?$")>;

				// Load Complement / Negative / Positive
				def : InstRW<[FXa], (instregex "LC(R\|GR)$")>;
				def : InstRW<[FXa, Lat2], (instregex "LN(R\|GR)$")>;
				def : InstRW<[FXa, FXa, Lat2, BeginGroup], (instregex "LCGFR$")>;
				def : InstRW<[FXa, FXa, Lat3, BeginGroup], (instregex "L(N\|P)GFR$")>;
				def : InstRW<[FXa, Lat2], (instregex "LP(G)?R$")>;

				// Load on condition
				def : InstRW<[FXa, LSU, Lat6], (instregex "LOC(G)?$")>;
				def : InstRW<[FXa, Lat2], (instregex "LOC(G)?R$")>;
				def : InstRW<[FXa, Lat2], (instregex "LOC(G)?HI$")>;

				// Stores
				def : InstRW<[FXb, LSU, Lat5], (instregex "STG(RL)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "ST(X\|128)$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "STH(H\|Y\|RL\|Mux)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "ST(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "STC(H\|Y\|Mux)?$")>;
				def : InstRW<[FXb, LSU, Lat5], (instregex "STRV(G\|H)?$")>;

				// Store on condition / CondStore pseudos
				def : InstRW<[FXb, LSU, Lat5], (instregex "STOC(G)?$")>;
				def : InstRW<[FXa], (instregex "CondStore16(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore16Mux(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore32(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore64(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore8(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStore8Mux(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStoreF32(Inv)?$")>;
				def : InstRW<[FXa], (instregex "CondStoreF64(Inv)?$")>;

				// Store multiple (estimated average of ceil(5/2) FXb ops)
				def : InstRW<[LSU, LSU, FXb, FXb, FXb, Lat10,
				GroupAlone], (instregex "STM(G\|H\|Y)?$")>;

				// Store real address
				def : InstRW<[FXb, LSU, Lat5], (instregex "STRAG$")>;

				// Select pseudo
				def : InstRW<[FXa], (instregex "Select(32\|64\|F32\|F64\|F128\|32Mux)$")>;

				// String instructions
				def : InstRW<[FXa, LSU, Lat30], (instregex "SRST$")>;
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;

				///// FLOATING POINT

				// Addition
				def : InstRW<[VecBF, LSU, Lat12], (instregex "A(E\|D)B$")>;
				def : InstRW<[VecBF], (instregex "A(E\|D)BR$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "AXBR$")>;

				// Subtraction
				def : InstRW<[VecBF, LSU, Lat12], (instregex "S(E\|D)B$")>;
				def : InstRW<[VecBF], (instregex "S(E\|D)BR$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "SXBR$")>;

				// Multiply
				def : InstRW<[VecBF, LSU, Lat12], (instregex "M(D\|DE\|EE)B$")>;
				def : InstRW<[VecBF], (instregex "M(D\|DE\|EE)BR$")>;
				def : InstRW<[VecBF, VecBF, LSU, Lat12, GroupAlone], (instregex "MXDB$")>;
				def : InstRW<[VecBF, VecBF, Lat9, GroupAlone], (instregex "MXDBR$")>;
				def : InstRW<[VecDF, VecDF, Lat20, GroupAlone], (instregex "MXBR$")>;

				// Multiply and add / subtract
				def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A\|S)EB$")>;
				def : InstRW<[VecBF, GroupAlone], (instregex "M(A\|S)EBR$")>;
				def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A\|S)DB$")>;
				def : InstRW<[VecBF], (instregex "M(A\|S)DBR$")>;

				// Division
				def : InstRW<[VecFPd, LSU], (instregex "D(E\|D)B$")>;
				def : InstRW<[VecFPd], (instregex "D(E\|D)BR$")>;
				def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "DXBR$")>;

				// Square root
				def : InstRW<[VecFPd, LSU], (instregex "SQ(E\|D)B$")>;
				def : InstRW<[VecFPd], (instregex "SQ(E\|D)BR$")>;
				def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "SQXBR$")>;

				// Convert from fixed / logical
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CE(F\|G)BR$")>;
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CD(F\|G)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat12, GroupAlone], (instregex "CX(F\|G)BR$")>;
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CEL(F\|G)BR$")>;
				def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CDL(F\|G)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat12, GroupAlone], (instregex "CXL(F\|G)BR$")>;

				// Convert to fixed / logical
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CF(E\|D)BR$")>;
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CG(E\|D)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "C(F\|G)XBR$")>;
				def : InstRW<[FXb, VecBF, Lat11, GroupAlone], (instregex "CLFEBR$")>;
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CLFDBR$")>;
				def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CLG(E\|D)BR$")>;
				def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "CL(F\|G)XBR$")>;

				// Copy sign
				def : InstRW<[VecXsPm], (instregex "CPSDRd(d\|s)$")>;
				def : InstRW<[VecXsPm], (instregex "CPSDRs(d\|s)$")>;

				// Compare
				def : InstRW<[VecXsPm, LSU, Lat8], (instregex "C(E\|D)B$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "C(E\|D)BR?$")>;
				def : InstRW<[VecDF, VecDF, Lat20, GroupAlone], (instregex "CXBR$")>;

				// Load and Test
				def : InstRW<[VecXsPm, Lat4], (instregex "LT(D\|E)BR$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "LTEBRCompare(_VecPseudo)?$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "LTDBRCompare(_VecPseudo)?$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "LTXBR$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone],
				(instregex "LTXBRCompare(_VecPseudo)?$")>;

				// Load
				def : InstRW<[VecXsPm, LSU, Lat7], (instregex "LE(Y)?$")>;
				def : InstRW<[VecXsPm], (instregex "LER$")>;
				def : InstRW<[FXb], (instregex "LD(R\|R32\|GR)$")>;
				def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "LXR$")>;

				// Load zero
				def : InstRW<[FXb], (instregex "LZ(DR\|ER)$")>;
				def : InstRW<[FXb, FXb, Lat2, BeginGroup], (instregex "LZXR$")>;

				// Load Complement / Negative / Positive
				def : InstRW<[VecXsPm, Lat4], (instregex "L(C\|N\|P)DBR$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "L(C\|N\|P)EBR$")>;
				def : InstRW<[FXb], (instregex "LCDFR(_32)?$")>;
				def : InstRW<[FXb], (instregex "LNDFR(_32)?$")>;
				def : InstRW<[FXb], (instregex "LPDFR(_32)?$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "L(C\|N\|P)XBR$")>;

				// Load lengthened
				def : InstRW<[VecBF, LSU, Lat12], (instregex "LDEB$")>;
				def : InstRW<[VecBF], (instregex "LDEBR$")>;
				def : InstRW<[VecBF, VecBF, LSU, Lat12 , GroupAlone], (instregex "LX(D\|E)B$")>;
				def : InstRW<[VecBF, VecBF, Lat9 , GroupAlone], (instregex "LX(D\|E)BR$")>;

				// Load rounded
				def : InstRW<[VecBF], (instregex "LEDBR(A)?$")>;
				def : InstRW<[VecDF, VecDF, Lat20], (instregex "LEXBR(A)?$")>;
				def : InstRW<[VecDF, VecDF, Lat20], (instregex "LDXBR(A)?$")>;

				// Load FP integer
				def : InstRW<[VecBF], (instregex "FIEBR(A)?$")>;
				def : InstRW<[VecBF], (instregex "FIDBR(A)?$")>;
				def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "FIXBR(A)?$")>;

				// Store
				def : InstRW<[FXb, LSU, Lat7], (instregex "STD(Y)?$")>;
				def : InstRW<[FXb, LSU, Lat7], (instregex "STE(Y)?$")>;

				// Test Data Class
				def : InstRW<[LSU, VecXsPm, Lat9], (instregex "TC(E\|D)B$")>;
				def : InstRW<[LSU, VecDF, VecDF, Lat15, GroupAlone], (instregex "TCXB$")>;

				///// VECTOR

				// Various
				def : InstRW<[VecXsPm], (instregex "VA(B\|F\|G\|H\|Q\|CQ)$")>;
				uweigand: It would be nice to at least separate out vector floating-point instructions, so we can easily see where W variants are needed.
				def : InstRW<[VecXsPm], (instregex "VACC(B\|F\|G\|H\|Q\|CQ)$")>;
				def : InstRW<[VecXsPm], (instregex "VAVG(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VAVGL(B\|F\|G\|H)$")>;
				def : InstRW<[VecBF], (instregex "VCD(GB\|LGB)$")>;
				def : InstRW<[VecBF], (instregex "WCD(GB\|LGB)$")>;
				def : InstRW<[VecXsPm], (instregex "VCEQ(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VCEQ(B\|F\|G\|H)S$")>;
				def : InstRW<[VecBF], (instregex "VCGDB$")>;
				def : InstRW<[VecBF], (instregex "WCGDB$")>;
				def : InstRW<[VecXsPm], (instregex "VCH(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VCH(B\|F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VCHL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VCHL(B\|F\|G\|H)S$")>;
				def : InstRW<[VecMul], (instregex "VCKSM$")>;
				def : InstRW<[VecBF], (instregex "VCLGDB$")>;
				def : InstRW<[VecBF], (instregex "WCLGDB$")>;
				def : InstRW<[VecXsPm], (instregex "VCLZ(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VCTZ(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VEC(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VECL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VERIM(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VERLL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VERLLV(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESLV(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRA(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRAV(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VESRLV(B\|F\|G\|H)$")>;
				def : InstRW<[VecStr], (instregex "VFAEB$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFAEBS$")>;
				def : InstRW<[VecBF], (instregex "VFADB$")>;
				def : InstRW<[VecStr], (instregex "VFAE(F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFAE(F\|H)S$")>;
				def : InstRW<[VecStr], (instregex "VFAEZ(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFAEZ(B\|F\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VFC(E\|H\|HE)DB$")>;
				def : InstRW<[VecXsPm], (instregex "WFC(E\|H\|HE)DB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VFC(E\|H\|HE)DBS$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "WFC(E\|H\|HE)DBS$")>;
				def : InstRW<[VecStr], (instregex "VFEE(B\|F\|H\|ZB\|ZF\|ZH)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFEE(B\|F\|H\|ZB\|ZF\|ZH)S$")>;
				def : InstRW<[VecStr], (instregex "VFENE(B\|F\|H\|ZB\|ZF\|ZH)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VFENE(B\|F\|H\|ZB\|ZF\|ZH)S$")>;
				def : InstRW<[VecBF], (instregex "VF(I\|M\|S)DB$")>;
				def : InstRW<[VecXsPm], (instregex "VFL(C\|N\|P)DB$")>;
				def : InstRW<[VecXsPm], (instregex "WFL(C\|N\|P)DB$")>;
				def : InstRW<[VecBF], (instregex "VFM(A\|S)DB$")>;
				def : InstRW<[VecBF], (instregex "WFM(A\|S)DB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VFTCIDB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "WFTCIDB$")>;
				def : InstRW<[VecXsPm], (instregex "VGBM$")>;
				def : InstRW<[VecMul], (instregex "VGFMA(B\|F\|G\|H)$")>;
				def : InstRW<[VecMul], (instregex "VGFM(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VGM(B\|F\|G\|H)$")>;
				def : InstRW<[VecStr], (instregex "VISTR(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VISTR(B\|F\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VLC(B\|F\|G\|H)$")>;
				def : InstRW<[VecBF], (instregex "VL(DE\|ED)B$")>;
				def : InstRW<[VecBF], (instregex "WL(DE\|ED)B$")>;
				def : InstRW<[VecXsPm, LSU, Lat7], (instregex "VLE(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VLEI(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VLP(B\|F\|G\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAE(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAH(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAL(B\|F)$")>;
				def : InstRW<[VecMul], (instregex "VMALE(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMALH(B\|F\|H\|W)$")>;
				def : InstRW<[VecMul], (instregex "VMALO(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMAO(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VME(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMH(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VML(B\|F)$")>;
				def : InstRW<[VecMul], (instregex "VMLE(B\|F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMLH(B\|F\|H\|W)$")>;
				def : InstRW<[VecMul], (instregex "VMLO(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMN(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMNL(B\|F\|G\|H)$")>;
				def : InstRW<[VecMul], (instregex "VMO(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMRH(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMRL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMX(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VMXL(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VN(C\|O)?$")>;
				def : InstRW<[VecXsPm], (instregex "VO(NE)?$")>;
				def : InstRW<[VecXsPm], (instregex "VPDI$")>;
				def : InstRW<[VecXsPm], (instregex "VPERM$")>;
				def : InstRW<[VecXsPm], (instregex "VPK(F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VPKLS(F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VPKLS(F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VPKS(F\|G\|H)$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "VPKS(F\|G\|H)S$")>;
				def : InstRW<[VecXsPm], (instregex "VPOPCT$")>;
				def : InstRW<[VecXsPm], (instregex "VREP(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VREPI(B\|F\|G\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VSB(IQ\|CBIQ)?$")>;
				def : InstRW<[VecXsPm], (instregex "VSCBI(B\|F\|G\|H\|Q)$")>;
				def : InstRW<[VecXsPm], (instregex "VSEG(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VS(F\|G\|H\|Q\|EL)$")>;
				def : InstRW<[VecXsPm], (instregex "VSL(DB)?$")>;
				def : InstRW<[VecXsPm], (instregex "VSR(A\|L)$")>;
				def : InstRW<[VecStr], (instregex "VSTRC(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VSTRC(B\|F\|H)S$")>;
				def : InstRW<[VecStr], (instregex "VSTRCZ(B\|F\|H)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VSTRCZ(B\|F\|H)S$")>;
				def : InstRW<[VecMul], (instregex "VSUM(B\|H)$")>;
				def : InstRW<[VecMul], (instregex "VSUMG(F\|H)$")>;
				def : InstRW<[VecMul], (instregex "VSUMQ(F\|G)$")>;
				def : InstRW<[VecStr, Lat5], (instregex "VTM$")>;
				def : InstRW<[VecXsPm], (instregex "VUPH(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPL(B\|F)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPLH(B\|F\|H\|W)$")>;
				def : InstRW<[VecXsPm], (instregex "VUPLL(B\|F\|H)$")>;
				def : InstRW<[VecXsPm], (instregex "VX$")>;
				def : InstRW<[VecXsPm], (instregex "VZERO$")>;
				def : InstRW<[VecBF], (instregex "WF(A\|I\|M\|S)DB$")>;
				def : InstRW<[VecXsPm, Lat4], (instregex "WF(C\|K)DB$")>;

				// Vector divide / square root
				def : InstRW<[VecFPd], (instregex "(V\|W)FDDB$")>;
				def : InstRW<[VecFPd], (instregex "(V\|W)FSQDB$")>;

				// Moving between GPR and FPR
				def : InstRW<[FXb], (instregex "VLVG(B\|F\|G\|H)$")>;
				def : InstRW<[FXb], (instregex "LEFR$")>; // Printed as VLVGF
				def : InstRW<[FXb, Lat4], (instregex "VLGV(B\|F\|G\|H)$")>;
				def : InstRW<[FXb, Lat4], (instregex "LFER$")>; // Printed as VLGVF
				def : InstRW<[FXb, Lat2], (instregex "VLVGP(32)?$")>;

				// Load
				def : InstRW<[LSU], (instregex "VL(L\|BB)?$")>;
				def : InstRW<[LSU], (instregex "VL(32\|64)$")>;
				def : InstRW<[LSU], (instregex "VLLEZ(B\|F\|G\|H)$")>;
				def : InstRW<[LSU], (instregex "VLREP(B\|F\|G\|H)$")>;
				def : InstRW<[FXb], (instregex "VLR(32\|64)?$")>;

				// Store
				def : InstRW<[FXb, LSU, Lat8], (instregex "VST(L\|32\|64)?$")>;
				def : InstRW<[FXb, LSU, Lat8], (instregex "VSTE(F\|G)$")>;
				def : InstRW<[FXb, LSU, VecXsPm, Lat11, BeginGroup], (instregex "VSTE(B\|H)$")>;

				// Load / store multiple
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "VLM$")>;
				def : InstRW<[LSU, LSU, FXb, FXb, FXb, FXb, FXb, Lat20, GroupAlone],
				(instregex "VSTM$")>;

				// Byte instructions
				def : InstRW<[VecXsPm, VecXsPm, Lat8], (instregex "VSLB$")>;
				def : InstRW<[VecXsPm, VecXsPm, Lat8], (instregex "VSR(A\|L)B$")>;

				// Gather / scatter
				def : InstRW<[FXb, LSU, VecXsPm, Lat11, BeginGroup], (instregex "VGE(F\|G)$")>;
				def : InstRW<[FXb, FXb, LSU, Lat12, BeginGroup], (instregex "VSCE(F\|G)$")>;

				///// INLINE ASSEMBLY

				uweigand: I don't think there's a real difference between those and the ones listed under Other.
				def : InstRW<[LSU, LSU, LSU, FXa, FXa, FXb, Lat9, GroupAlone],
				(instregex "STCK(F)?$")>;
				def : InstRW<[LSU, LSU, LSU, LSU, FXa, FXa, FXb, FXb, Lat11, GroupAlone],
				(instregex "STCKE$")>;
				def : InstRW<[FXa, LSU, Lat5], (instregex "STFLE$")>;
				def : InstRW<[FXb, Lat30], (instregex "SVC$")>;

				///// OTHER

				// Extract Transaction Nesting Depth
				def : InstRW<[FXa], (instregex "ETND$")>;

				// Transaction begin
				def : InstRW<[LSU, LSU, FXb, FXb, FXb, FXb, FXb, Lat15, GroupAlone],
				(instregex "TBEGIN(C\|_nofloat)?$")>;

				// Transaction end
				def : InstRW<[FXb, GroupAlone], (instregex "TEND$")>;

				// Transaction abort
				def : InstRW<[LSU, GroupAlone], (instregex "TABORT$")>;

				// Load the Global Offset Table address
				def : InstRW<[FXa], (instregex "GOT$")>;

				uweigand (on GOT): This is just an alias for a LARL and should go there.

				// Prefetch data
				def : InstRW<[LSU], (instregex "PFD(RL)?$")>;

				// Extract access register
				def : InstRW<[LSU], (instregex "EAR$")>;

				// Insert Program Mask
				def : InstRW<[FXa, Lat3, EndGroup], (instregex "IPM$")>;

				// Processor assist
				def : InstRW<[FXb], (instregex "PPA$")>;

				// Extract CPU Time
				def : InstRW<[FXa, Lat5, LSU], (instregex "ECTG$")>;

				// Execute
				def : InstRW<[FXb, GroupAlone], (instregex "EX(RL)?$")>;

				// Program return
				def : InstRW<[FXb, Lat30], (instregex "PR$")>;

				}

lib/Target/SystemZ/SystemZScheduleZ196.td

This file was added.

				//==-- SystemZSchedule.td - SystemZ Scheduling Definitions ----- tblgen --==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the machine model for Z196 to support instruction
				// scheduling and other instruction cost heuristics.
				//
				//===----------------------------------------------------------------------===//

				def Z196Model : SchedMachineModel {

				let IssueWidth = 3; // 3 instructions decoded per cycle.
				let MicroOpBufferSize = 40; // Issue queues
				let LoadLatency = 1; // Optimistic load latency.

				let PostRAScheduler = 1;

				// Extra cycles for a mispredicted branch.
				let MispredictPenalty = 8;

				// This model does not include operand specific information.
				let CompleteModel = 0;
				}

				let SchedModel = Z196Model in {

				// These definitions could be put in a subtarget common include file,
				// but it seems the include system in Tablegen currently rejects
				// multiple includes of same file.
				def : WriteRes<GroupAlone, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				let EndGroup = 1;
				}
				def : WriteRes<EndGroup, []> {
				let NumMicroOps = 0;
				let EndGroup = 1;
				}
				def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
				def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
				def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
				def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
				def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
				def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
				def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
				def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
				def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
				def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
				def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
				def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
				def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
				def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}

				// Execution units.
				def Z196_FXUnit : ProcResource<1>;
				def Z196_LSUnit : ProcResource<1>;
				def Z196_FPUnit : ProcResource<1>;

				// Subtarget specific definitions of scheduling resources.
				def : WriteRes<FXU, [Z196_FXUnit]> { let Latency = 1; }
				def : WriteRes<LSU, [Z196_LSUnit]> { let Latency = 4; }
				def : WriteRes<LSU_lat1, [Z196_LSUnit]> { let Latency = 1; }
				def : WriteRes<FPU, [ZEC12_FPUnit]> { let Latency = 8; }
				uweigand: What's the EC12 doing here?
				jonpa (author): Good heavens!

				// -------------------------- INSTRUCTIONS ---------------------------------- //

				// InstRW constructs have been used in order to preserve the
				// readability of the InstrInfo files.

				// For each instruction, as matched by a regexp, provide a list of
				// resources that it needs. These will be combined into a SchedClass.
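
A rough illustration of the matching described in the comment above. As far as can be told, TableGen's `instregex` behaves like a prefix match anchored at the start of the opcode name, which would explain why every pattern in this file ends in `$`. The sketch mimics that with Python's `re.match`. The helper `instregex` and the opcode list are hypothetical stand-ins, not LLVM code, and the `\|` shown in this listing is assumed to stand for plain alternation `|`.

```python
import re


def instregex(pattern, opcodes):
    """Return the opcode names selected by a prefix-anchored pattern.

    Hypothetical helper: re.match anchors only at the start of the string,
    which is the behaviour assumed here for TableGen's instregex.
    """
    rx = re.compile(pattern)
    return [op for op in opcodes if rx.match(op)]


# Illustrative subset of opcode names taken from the Branch group below.
OPCODES = ["BRC", "BRCL", "BRCT", "BRCTG"]

# With the trailing "$" only BRC and BRCL are selected.
with_dollar = instregex(r"BRC(L)?$", OPCODES)

# Without it, the prefix match also picks up BRCT and BRCTG.
without_dollar = instregex(r"BRC(L)?", OPCODES)

print(with_dollar)
print(without_dollar)
```

Under these assumptions, dropping the `$` would silently attach the `BRC(L)?` resources to `BRCT(G)?` as well, so the end anchor matters for every entry.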

				// Call
				def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "BRAS$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BASR$")>;
				def : InstRW<[LSU, EndGroup], (instregex "CallB(C)?R$")>;
				def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "(Call)?BRASL$")>;
				def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "TLS_(G\|L)DCALL$")>;
				def : InstRW<[LSU, EndGroup], (instregex "CallBRCL$")>;
				def : InstRW<[LSU, EndGroup], (instregex "CallJG$")>;

				// Return
				def : InstRW<[LSU_lat1, EndGroup], (instregex "Return$")>;
				def : InstRW<[LSU_lat1, EndGroup], (instregex "CondReturn$")>;

				// Branch
				def : InstRW<[LSU, EndGroup], (instregex "B(C)?R$")>;
				def : InstRW<[LSU, EndGroup], (instregex "BRC(L)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BRCT(G)?$")>;
				def : InstRW<[LSU, EndGroup], (instregex "J(G)?$")>;
				def : InstRW<[FXU, FXU, FXU, LSU, Lat7, GroupAlone], (instregex "BRX(H\|LE)$")>;

				// Compare and branch
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "C(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CG(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CL(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLG(I\|R)J$")>;
				def : InstRW<[FXU, LSU, GroupAlone], (instregex "CG(R\|I)J$")>;
				def : InstRW<[FXU, LSU, GroupAlone], (instregex "C(R\|I)J$")>;
				def : InstRW<[FXU], (instregex "CL(R\|I)J$")>;
				def : InstRW<[FXU], (instregex "CLR$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLR(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CRB(Call\|Return)?$")>;
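
Some opcode names in this group appear to be selected by more than one entry, for example `C(I\|R)J$` and `C(R\|I)J$`, with different resource lists. A minimal sketch for spotting such overlaps, assuming prefix-match semantics and reading `\|` as plain `|`; `find_overlaps`, `ENTRIES` and `OPCODES` are hypothetical names, and only a handful of entries from the listing are copied in.

```python
import re
from collections import defaultdict

# (pattern, resources) pairs copied from the compare-and-branch group above.
ENTRIES = [
    (r"C(I|R)J$", ["FXU", "LSU", "Lat5", "GroupAlone"]),
    (r"CL(I|R)J$", ["FXU", "LSU", "Lat5", "GroupAlone"]),
    (r"C(R|I)J$", ["FXU", "LSU", "GroupAlone"]),
    (r"CL(R|I)J$", ["FXU"]),
]

# Illustrative subset of opcode names.
OPCODES = ["CIJ", "CRJ", "CLIJ", "CLRJ"]


def find_overlaps(entries, opcodes):
    """Map each opcode to the patterns selecting it, keeping only opcodes hit twice or more."""
    hits = defaultdict(list)
    for pattern, _resources in entries:
        rx = re.compile(pattern)
        for op in opcodes:
            if rx.match(op):  # prefix-anchored, like the assumed instregex behaviour
                hits[op].append(pattern)
    return {op: pats for op, pats in hits.items() if len(pats) > 1}


overlaps = find_overlaps(ENTRIES, OPCODES)
for op, pats in sorted(overlaps.items()):
    print(op, pats)
```

The same check could be run over the full pattern list of a model file to see which names need a single, unambiguous entry.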

				// Serialize
				def : InstRW<[LSU, EndGroup], (instregex "Serialize$")>;

				// Trap instructions
				def : InstRW<[LSU, EndGroup], (instregex "(Cond)?Trap$")>;

				///// FIXED POINT

				// Addition
				def : InstRW<[FXU, LSU, Lat5], (instregex "A(Y\|SI)?$")>;
				def : InstRW<[FXU], (instregex "AIH$")>;
				def : InstRW<[FXU], (instregex "AFI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AG(SI)?$")>;
				def : InstRW<[FXU], (instregex "AGFI$")>;
				def : InstRW<[FXU], (instregex "AGHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AGR(K)?$")>;
				def : InstRW<[FXU], (instregex "AHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AHIMux(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AL(Y)?$")>;
				def : InstRW<[FXU], (instregex "AL(FI\|HSIK)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ALG(F)?$")>;
				def : InstRW<[FXU], (instregex "ALGHSIK$")>;
				def : InstRW<[FXU], (instregex "ALGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "ALGR(K)?$")>;
				def : InstRW<[FXU], (instregex "ALR(K)?$")>;
				def : InstRW<[FXU], (instregex "AR(K)?$")>;

				// Logical addition with carry
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "ALC(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "ALC(G)?R$")>;

				// Add with sign extension (32 -> 64)
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "AGF$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "AGFR$")>;

				// Add halfword
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "AH(Y)?$")>;

				// Subtraction
				def : InstRW<[FXU, LSU, Lat5], (instregex "S(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "SGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLFI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "SL(G\|GF\|Y)?$")>;
				def : InstRW<[FXU], (instregex "SLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "SLGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLR(K)?$")>;
				def : InstRW<[FXU], (instregex "SR(K)?$")>;

				// Subtraction with borrow
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "SLB(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "SLB(G)?R$")>;

				// Subtraction with sign extension (32 -> 64)
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "SGF$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "SGFR$")>;

				// Subtract halfword
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "SH(Y)?$")>;

				// Multiply
				def : InstRW<[FXU, LSU, Lat10], (instregex "MS(GF\|Y)?$")>;
				def : InstRW<[FXU, Lat6], (instregex "MS(R\|FI)$")>;
				def : InstRW<[FXU, LSU, Lat12], (instregex "MSG$")>;
				def : InstRW<[FXU, Lat8], (instregex "MSGR$")>;
				def : InstRW<[FXU, Lat6], (instregex "MSGF(I\|R)$")>;
				def : InstRW<[FXU, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
				def : InstRW<[FXU, Lat9, GroupAlone], (instregex "MLGR$")>;
				def : InstRW<[FXU, Lat5], (instregex "MGHI$")>;
				def : InstRW<[FXU, Lat5], (instregex "MHI$")>;
				def : InstRW<[FXU, LSU, Lat9], (instregex "MH(Y)?$")>;

				// Divide
				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?$")>;
				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?$")>;

				// And
				def : InstRW<[FXU, LSU, Lat5], (instregex "N(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "NGR(K)?$")>;
				def : InstRW<[FXU], (instregex "NI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "NI(Y)?$")>;
				def : InstRW<[FXU], (instregex "NIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "NILF(64)?$")>;
				def : InstRW<[FXU], (instregex "NILH(64)?$")>;
				def : InstRW<[FXU], (instregex "NILL(64)?$")>;
				def : InstRW<[FXU], (instregex "NR(K)?$")>;

				// Or
				def : InstRW<[FXU, LSU, Lat5], (instregex "O(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "OGR(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "OI(Y)?$")>;
				def : InstRW<[FXU], (instregex "OI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU], (instregex "OIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "OILF(64)?$")>;
				def : InstRW<[FXU], (instregex "OILH(64)?$")>;
				def : InstRW<[FXU], (instregex "OILL(64)?$")>;
				def : InstRW<[FXU], (instregex "OR(K)?$")>;

				// Xor
				def : InstRW<[FXU, LSU, Lat5], (instregex "X(G\|Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "XI(Y)?$")>;
				def : InstRW<[FXU], (instregex "XIFMux$")>;
				def : InstRW<[FXU], (instregex "XGR(K)?$")>;
				def : InstRW<[FXU], (instregex "XIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "XILF(64)?$")>;
				def : InstRW<[FXU], (instregex "XR(K)?$")>;

				// Insert
				def : InstRW<[FXU, LSU, Lat5], (instregex "IC(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "IC32(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ICM(H\|Y)?$")>;
				def : InstRW<[FXU], (instregex "II(F\|H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "IIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "IILF(64)?$")>;
				def : InstRW<[FXU], (instregex "IILH(64)?$")>;
				def : InstRW<[FXU], (instregex "IILL(64)?$")>;

				// And / Or / Xor character
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "(N\|O\|X)C$")>;

				// Shifts
				def : InstRW<[FXU], (instregex "SLL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRA(G\|K)?$")>;
				def : InstRW<[FXU, Lat2], (instregex "SLA(K)?$")>;

				// Rotate
				def : InstRW<[FXU, LSU, Lat6], (instregex "RLL(G)?$")>;

				// Rotate and insert
				def : InstRW<[FXU], (instregex "RISBG(32)?$")>;
				def : InstRW<[FXU], (instregex "RISBH(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBL(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBMux$")>;

				// Rotate and Select
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "R(N\|O\|X)SBG$")>;

				// Extend
				def : InstRW<[FXU], (instregex "AEXT128_64$")>;
				def : InstRW<[FXU], (instregex "ZEXT128_(32\|64)$")>;

				// Find leftmost one
				def : InstRW<[FXU, Lat7, GroupAlone], (instregex "FLOGR$")>;

				// Population count
				def : InstRW<[FXU, Lat3], (instregex "POPCNT$")>;

				// Compare
				def : InstRW<[FXU, LSU, Lat5], (instregex "C(G\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXU], (instregex "CFI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "CG(F\|H)I$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CG(HSI\|RL)$")>;
				def : InstRW<[FXU], (instregex "C(G)?R$")>;
				def : InstRW<[FXU], (instregex "C(HI\|IH)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CH(F\|SI)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CL(Y\|Mux\|FHSI)?$")>;
				def : InstRW<[FXU], (instregex "CLFI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLG(HRL\|HSI)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGF(RL)?$")>;
				def : InstRW<[FXU], (instregex "CLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "CLGR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGRL$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLH(F\|RL\|HSI)$")>;
				def : InstRW<[FXU], (instregex "CLIH$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLI(Y)?$")>;
				def : InstRW<[FXU], (instregex "CLR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLRL$")>;

				// Compare halfword
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CH(Y\|RL)?$")>;
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CGH(RL)?$")>;
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CHHSI$")>;

				// Compare with sign extension (32 -> 64)
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CGF(RL)?$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "CGFR$")>;

				// Compare and swap
				def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CS(G\|Y)?$")>;

				// Compare logical character
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "CLC$")>;

				// Compare and trap
				def : InstRW<[FXU], (instregex "C(G)?IT$")>;
				def : InstRW<[FXU], (instregex "C(G)?RT$")>;
				def : InstRW<[FXU], (instregex "CLG(I\|R)T$")>;
				def : InstRW<[FXU], (instregex "CLFIT$")>;
				def : InstRW<[FXU], (instregex "CLRT$")>;

				// Test under mask
				def : InstRW<[FXU, LSU, Lat5], (instregex "TM(Y)?$")>;
				def : InstRW<[FXU], (instregex "TM(H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "TMHH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMHL(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLL(64)?$")>;

				// Load and test
				def : InstRW<[FXU, LSU, Lat5], (instregex "LT(G\|GF)?$")>;
				def : InstRW<[FXU], (instregex "LT(G\|GF)?R$")>;

				// Moves
				def : InstRW<[FXU, LSU, Lat5], (instregex "MV(G\|H)?HI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "MVI(Y)?$")>;

				// Move character
				def : InstRW<[LSU, LSU, LSU, FXU, Lat8, GroupAlone], (instregex "MVC$")>;

				// Move with key
				def : InstRW<[LSU, Lat8, GroupAlone], (instregex "MVCK$")>;

				// Pseudo -> reg move
				def : InstRW<[FXU], (instregex "COPY(_TO_REGCLASS)?$")>;
				def : InstRW<[FXU], (instregex "EXTRACT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "INSERT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "REG_SEQUENCE$")>;
				def : InstRW<[FXU], (instregex "SUBREG_TO_REG$")>;

				// Loads (LSU)
				def : InstRW<[LSU], (instregex "L(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[LSU], (instregex "LD(Y\|E32)?$")>;
				def : InstRW<[LSU], (instregex "LG(RL)?$")>;
				def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
				def : InstRW<[LSU], (instregex "LLG(C\|F\|H\|FRL\|HRL)$")>;
				def : InstRW<[LSU], (instregex "LLH(RL\|Mux)?$")>;
				def : InstRW<[LSU], (instregex "L(X\|128)$")>;

				// Loads (FXU)
				def : InstRW<[FXU, LSU, Lat5], (instregex "LL(C\|H)H$")>;
				def : InstRW<[FXU], (instregex "LLCR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLG(C\|F\|H)R$")>;
				def : InstRW<[FXU], (instregex "LLHR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLIH(F\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "LLIL(F\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "LA(Y\|RL)?$")>;
				def : InstRW<[FXU], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAA(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAAL(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAN(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAO(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAX(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LB(H\|Mux)?$")>;
				def : InstRW<[FXU], (instregex "L(B\|G)R$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LGB$")>;
				def : InstRW<[FXU], (instregex "LGBR$")>;
				def : InstRW<[FXU], (instregex "LG(F\|H)I$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(F\|H)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(F\|H)RL$")>;
				def : InstRW<[FXU], (instregex "LG(F\|H)R$")>;
				def : InstRW<[FXU], (instregex "LHI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LH(H\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXU], (instregex "LHR$")>;
				def : InstRW<[FXU], (instregex "LR(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LRV(G\|H)?$")>;
				def : InstRW<[FXU], (instregex "LRV(G)?R$")>;

				// Load GR from FPR
				def : InstRW<[FXU, Lat3], (instregex "LGDR$")>;

				// Load multiple (estimated average of 5 ops)
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "LM(H\|Y\|G)?$")>;

				// Load Complement / Negative / Positive
				def : InstRW<[FXU], (instregex "LC(R\|GR)$")>;
				def : InstRW<[FXU, Lat2], (instregex "LN(R\|GR)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LCGFR$")>;
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "L(N\|P)GFR$")>;
				def : InstRW<[FXU, Lat2], (instregex "LP(G)?R$")>;

				// Load on condition
				def : InstRW<[FXU, LSU, Lat6, EndGroup], (instregex "LOC(G)?$")>;
				def : InstRW<[FXU, Lat2, EndGroup], (instregex "LOC(G)?R$")>;

				// Stores
				def : InstRW<[FXU, LSU, Lat5], (instregex "STG(RL)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST(X\|128)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STH(H\|Y\|RL\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STC(H\|Y\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRV(G\|H)?$")>;

				// Store on condition / CondStore pseudos
				def : InstRW<[FXU, LSU, Lat5, EndGroup], (instregex "STOC(G)?$")>;
				def : InstRW<[FXU], (instregex "CondStore16(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore16Mux(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore64(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8Mux(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStoreF32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStoreF64(Inv)?$")>;

				// Store multiple (estimated average of 3 ops)
				def : InstRW<[LSU, LSU, FXU, FXU, FXU, Lat10, GroupAlone],
				(instregex "STM(H\|Y\|G)?$")>;

				// Store real address
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRAG$")>;

				// Select pseudo
				def : InstRW<[FXU], (instregex "Select(32\|64\|F32\|F64\|F128\|32Mux)$")>;

				// String instructions
				def : InstRW<[FXU, LSU, Lat30], (instregex "SRST$")>;
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;

				///// FLOATING POINT

				// Addition
				def : InstRW<[FPU, LSU, Lat12], (instregex "A(E\|D)B$")>;
				def : InstRW<[FPU], (instregex "A(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "AXBR$")>;

				// Subtraction
				def : InstRW<[FPU, LSU, Lat12], (instregex "S(E\|D)B$")>;
				def : InstRW<[FPU], (instregex "S(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "SXBR$")>;

				// Multiply
				def : InstRW<[FPU, LSU, Lat12], (instregex "M(D\|DE\|EE)B$")>;
				def : InstRW<[FPU], (instregex "M(D\|DE\|EE)BR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "MXDB$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "MXDBR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "MXBR$")>;

				// Multiply and add / subtract
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A\|S)EB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A\|S)EBR$")>;
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A\|S)DB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A\|S)DBR$")>;

				// Division
				def : InstRW<[FPU, LSU, Lat30], (instregex "D(E\|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "D(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "DXBR$")>;

				// Square root
				def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E\|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "SQ(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "SQXBR$")>;

				// Convert from fixed / logical
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CX(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CEL(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CDL(F\|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CXL(F\|G)BR$")>;

				// Convert to fixed / logical
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F\|G)XBR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLF(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLG(E\|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "CL(F\|G)XBR$")>;

				// Copy sign
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRd(d\|s)$")>;
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRs(d\|s)$")>;

				// Compare
				def : InstRW<[FPU, LSU, Lat12], (instregex "C(E\|D)B$")>;
				def : InstRW<[FPU], (instregex "C(E\|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30], (instregex "CXBR$")>;

				// Load and Test
				def : InstRW<[FPU], (instregex "LT(D\|E)BR$")>;
				def : InstRW<[FPU], (instregex "LTEBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU], (instregex "LTDBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "LTXBR$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone],
				(instregex "LTXBRCompare(_VecPseudo)?$")>;

				// Load
				def : InstRW<[LSU], (instregex "LE(Y)?$")>;
				def : InstRW<[FXU], (instregex "LER$")>;
				def : InstRW<[FXU], (instregex "LD(R\|R32\|GR)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LXR$")>;

				// Load zero
				def : InstRW<[FXU], (instregex "LZ(DR\|ER)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LZXR$")>;

				// Load Complement / Negative / Positive
				def : InstRW<[FPU], (instregex "L(C\|N\|P)DBR$")>;
				def : InstRW<[FPU], (instregex "L(C\|N\|P)EBR$")>;
				def : InstRW<[FXU], (instregex "LCDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LNDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LPDFR(_32)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "L(C\|N\|P)XBR$")>;

				// Load lengthened
				def : InstRW<[FPU, LSU, Lat12], (instregex "LDEB$")>;
				def : InstRW<[FPU], (instregex "LDEBR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "LX(D\|E)B$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "LX(D\|E)BR$")>;

				// Load rounded
				def : InstRW<[FPU], (instregex "LEDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LEXBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LDXBR(A)?$")>;

				// Load FP integer
				def : InstRW<[FPU], (instregex "FIEBR(A)?$")>;
				def : InstRW<[FPU], (instregex "FIDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat15, GroupAlone], (instregex "FIXBR(A)?$")>;

				// Store
				def : InstRW<[FXU, LSU, Lat7], (instregex "STD(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat7], (instregex "STE(Y)?$")>;

				// Test Data Class
				def : InstRW<[FPU, LSU, Lat15], (instregex "TC(E\|D)B$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "TCXB$")>;

				///// INLINE ASSEMBLY

				def : InstRW<[FXU, LSU, Lat15], (instregex "STCK$")>;
				def : InstRW<[FXU, LSU, Lat12], (instregex "STCKF$")>;
				def : InstRW<[LSU, FXU, Lat5], (instregex "STCKE$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STFLE$")>;
				def : InstRW<[FXU, Lat30], (instregex "SVC$")>;

				///// OTHER

				// Load the Global Offset Table address
				def : InstRW<[FXU], (instregex "GOT$")>;

				// Prefetch data
				def : InstRW<[LSU, GroupAlone], (instregex "PFD(RL)?$")>;

				// Extract access register
				def : InstRW<[LSU], (instregex "EAR$")>;

				// Insert Program Mask
				def : InstRW<[FXU, Lat3, EndGroup], (instregex "IPM$")>;

				// Extract CPU Time
				def : InstRW<[FXU, Lat5, LSU], (instregex "ECTG$")>;

				// Execute
				def : InstRW<[LSU, GroupAlone], (instregex "EX(RL)?$")>;

				// Program return
				def : InstRW<[FXU, Lat30], (instregex "PR$")>;
				}

lib/Target/SystemZ/SystemZScheduleZEC12.td

This file was added.

				//==-- SystemZScheduleZEC12.td - ZEC12 Scheduling Definitions -- tblgen --==//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the machine model for ZEC12 to support instruction
				// scheduling and other instruction cost heuristics.
				//
				//===----------------------------------------------------------------------===//

				def ZEC12Model : SchedMachineModel {

				let IssueWidth = 3; // 3 instructions decoded per cycle.
				let MicroOpBufferSize = 40; // Issue queues
				let LoadLatency = 1; // Optimistic load latency.

				let PostRAScheduler = 1;

				// Extra cycles for a mispredicted branch.
				let MispredictPenalty = 8;

				// This model does not include operand-specific information.
				let CompleteModel = 0;
				}

				let SchedModel = ZEC12Model in {

				// These definitions could be put in a subtarget common include file,
				// but it seems the include system in TableGen currently rejects
				// multiple includes of the same file.
				def : WriteRes<GroupAlone, []> {
				let NumMicroOps = 0;
				let BeginGroup = 1;
				let EndGroup = 1;
				}
				def : WriteRes<EndGroup, []> {
				let NumMicroOps = 0;
				let EndGroup = 1;
				}
				def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
				def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
				def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
				def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
				def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
				def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
				def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
				def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
				def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
				def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
				def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
				def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
				def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
				def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}
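
The `GroupAlone` and `EndGroup` definitions above only set the `BeginGroup` / `EndGroup` flags; how those flags shape decoder groups can be sketched in a few lines. The following Python is a simplified illustration, not the patch's hazard recognizer: it assumes a group holds at most `IssueWidth` (3) instructions, that a `GroupAlone` instruction occupies a group by itself, and that an `EndGroup` instruction closes the group it joins. The function name `form_groups` and the kind labels are hypothetical.

```python
from typing import Iterable, List, Tuple

ISSUE_WIDTH = 3  # mirrors IssueWidth in the ZEC12 model (assumed group size)


def form_groups(instrs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Greedily pack (name, kind) pairs into decoder groups.

    kind is "normal", "end" (EndGroup) or "alone" (GroupAlone).
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for name, kind in instrs:
        if kind == "alone":
            # BeginGroup + EndGroup: flush the open group, then issue alone.
            if current:
                groups.append(current)
                current = []
            groups.append([name])
            continue
        current.append(name)
        # Close the group when it is full or the instruction ends it.
        if kind == "end" or len(current) == ISSUE_WIDTH:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


# Kinds follow the ZEC12 entries: MVC is GroupAlone, BRCT is EndGroup.
example = [("LR", "normal"), ("AR", "normal"), ("MVC", "alone"),
           ("SR", "normal"), ("BRCT", "end"), ("XR", "normal")]
groups = form_groups(example)
print(groups)
```

For this input the sketch yields four groups: `LR, AR`, then `MVC` alone, then `SR, BRCT`, then `XR`. Real decoder grouping has further constraints that this deliberately ignores.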

				// Execution units.
				def ZEC12_VBUnit : ProcResource<1>;
				def ZEC12_FXUnit : ProcResource<1>;
				def ZEC12_LSUnit : ProcResource<1>;
				def ZEC12_FPUnit : ProcResource<1>;

				// Subtarget-specific definitions of scheduling resources.
				def : WriteRes<FXU, [ZEC12_FXUnit]> { let Latency = 1; }
				def : WriteRes<LSU, [ZEC12_LSUnit]> { let Latency = 4; }
				def : WriteRes<LSU_lat1, [ZEC12_LSUnit]> { let Latency = 1; }
				def : WriteRes<FPU, [ZEC12_FPUnit]> { let Latency = 8; }
				def : WriteRes<VBU, [ZEC12_VBUnit]>; // Virtual Branching Unit

				// -------------------------- INSTRUCTIONS ---------------------------------- //

				// InstRW constructs have been used in order to preserve the
				// readability of the InstrInfo files.

				// For each instruction, as matched by a regexp, provide a list of
				// resources that it needs. These will be combined into a SchedClass.
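
As a rough illustration of what "combined into a SchedClass" can mean, the Python sketch below folds a list of writes into one summary. It assumes the instruction latency is the maximum over the listed writes, that micro-ops are summed, and that the group flags are OR-ed together; this is an assumption about the generic scheduling model, and the custom strategy in this patch may consume the data differently. The `Write` class and `summarize` function are hypothetical names; the numbers are copied from the ZEC12 `WriteRes` definitions above, with 1 as the default for unset `Latency` and `NumMicroOps`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Write:
    latency: int = 1
    micro_ops: int = 1
    begin_group: bool = False
    end_group: bool = False


# Subset of the ZEC12 WriteRes definitions, transcribed by hand.
WRITES: Dict[str, Write] = {
    "FXU": Write(latency=1),
    "LSU": Write(latency=4),
    "FPU": Write(latency=8),
    "Lat5": Write(latency=5, micro_ops=0),
    "Lat9": Write(latency=9, micro_ops=0),
    "GroupAlone": Write(micro_ops=0, begin_group=True, end_group=True),
    "EndGroup": Write(micro_ops=0, end_group=True),
}


def summarize(names: List[str]) -> Write:
    """Fold a resource list into one summary (max latency, summed micro-ops)."""
    ws = [WRITES[n] for n in names]
    return Write(
        latency=max(w.latency for w in ws),
        micro_ops=sum(w.micro_ops for w in ws),
        begin_group=any(w.begin_group for w in ws),
        end_group=any(w.end_group for w in ws),
    )


xor_mem = summarize(["FXU", "LSU", "Lat5"])                   # "X(G|Y)?$"
clc = summarize(["FXU", "LSU", "LSU", "Lat9", "GroupAlone"])  # "CLC$"
print(xor_mem)
print(clc)
```

Under these assumptions the memory form of Xor comes out as 2 micro-ops with latency 5, and `CLC` as 3 micro-ops with latency 9 in a group of its own. This also shows why the `LatN` writes carry `NumMicroOps = 0`: they raise the latency without adding to the micro-op count.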

				// Call
				def : InstRW<[VBU, FXU, FXU, Lat3, GroupAlone], (instregex "BRAS$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BASR$")>;
				def : InstRW<[LSU, Lat4], (instregex "CallB(C)?R$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BRASL$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "TLS_(G\|L)DCALL$")>;
				def : InstRW<[VBU], (instregex "CallBRCL$")>;
				def : InstRW<[VBU], (instregex "CallJG$")>;

				// Return
				def : InstRW<[LSU_lat1, EndGroup], (instregex "Return$")>;
				def : InstRW<[LSU_lat1], (instregex "CondReturn$")>;

				// Branch
				def : InstRW<[LSU, Lat4], (instregex "B(C)?R$")>;
				def : InstRW<[VBU], (instregex "BRC(L)?$")>;
				def : InstRW<[FXU, EndGroup], (instregex "BRCT(G)?$")>;
				def : InstRW<[VBU], (instregex "J(G)?$")>;
				def : InstRW<[FXU, FXU, FXU, LSU, Lat7, GroupAlone], (instregex "BRX(H\|LE)$")>;

				// Compare and branch
				def : InstRW<[FXU], (instregex "C(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "CG(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "CL(I\|R)J$")>;
				def : InstRW<[FXU], (instregex "CLG(I\|R)J$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CGIB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLGRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLR(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CLRB(Call\|Return)?$")>;
				def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "CRB(Call\|Return)?$")>;
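
With this many patterns it is easy for one instruction name to be matched by more than one `InstRW` entry. The Python sketch below shows one way such overlaps could be detected outside TableGen. It assumes `instregex` performs a prefix-style match anchored at the start of the name, which `re.match` mimics, and it writes the alternations with a plain `|`. The helper names and the sample instruction names are illustrative only. The patterns `CLR$` and `CLR(Call|Return)?$` both appear in this file.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def matches(pattern: str, name: str) -> bool:
    # re.match anchors at the start only; the trailing "$" in each
    # pattern supplies the end anchor.
    return re.match(pattern, name) is not None


def find_overlaps(patterns: Iterable[str],
                  names: Iterable[str]) -> Dict[str, List[str]]:
    """Return every name matched by more than one pattern."""
    patterns = list(patterns)
    hits: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        for pattern in patterns:
            if matches(pattern, name):
                hits[name].append(pattern)
    return {n: ps for n, ps in hits.items() if len(ps) > 1}


patterns = ["CL(I|R)J$", "CLR$", "CLR(Call|Return)?$", "CLRB(Call|Return)?$"]
# Sample names for illustration; a real check would use the full opcode list.
names = ["CLIJ", "CLRJ", "CLR", "CLRB", "CLRBCall", "CLRL"]
overlaps = find_overlaps(patterns, names)
print(overlaps)
```

Here only `CLR` is reported, because both `CLR$` and `CLR(Call|Return)?$` match it and assign different resource lists. Whether TableGen rejects such an overlap or silently picks one entry is not something this sketch determines.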

				// Serialize
				def : InstRW<[LSU, EndGroup], (instregex "Serialize$")>;

				// Trap instructions
				def : InstRW<[VBU], (instregex "(Cond)?Trap$")>;

				///// FIXED POINT

				// Addition
				def : InstRW<[FXU, LSU, Lat5], (instregex "A(Y\|SI)?$")>;
				def : InstRW<[FXU], (instregex "AIH$")>;
				def : InstRW<[FXU], (instregex "AFI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AG(SI)?$")>;
				def : InstRW<[FXU], (instregex "AGFI$")>;
				def : InstRW<[FXU], (instregex "AGHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AGR(K)?$")>;
				def : InstRW<[FXU], (instregex "AHI(K)?$")>;
				def : InstRW<[FXU], (instregex "AHIMux(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "AL(Y)?$")>;
				def : InstRW<[FXU], (instregex "AL(FI\|HSIK)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ALG(F)?$")>;
				def : InstRW<[FXU], (instregex "ALGHSIK$")>;
				def : InstRW<[FXU], (instregex "ALGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "ALGR(K)?$")>;
				def : InstRW<[FXU], (instregex "ALR(K)?$")>;
				def : InstRW<[FXU], (instregex "AR(K)?$")>;

				// Logical addition with carry
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "ALC(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "ALC(G)?R$")>;

				// Add with sign extension (32 -> 64)
				def : InstRW<[FXU, LSU, Lat6], (instregex "AGF$")>;
				def : InstRW<[FXU, Lat2], (instregex "AGFR$")>;

				// Add halfword
				def : InstRW<[FXU, LSU, Lat6], (instregex "AH(Y)?$")>;

				// Subtraction
				def : InstRW<[FXU, LSU, Lat5], (instregex "S(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "SGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLFI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "SL(G\|GF\|Y)?$")>;
				def : InstRW<[FXU], (instregex "SLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "SLGR(K)?$")>;
				def : InstRW<[FXU], (instregex "SLR(K)?$")>;
				def : InstRW<[FXU], (instregex "SR(K)?$")>;

				// Subtraction with borrow
				def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "SLB(G)?$")>;
				def : InstRW<[FXU, Lat3, GroupAlone], (instregex "SLB(G)?R$")>;

				// Subtraction with sign extension (32 -> 64)
				def : InstRW<[FXU, LSU, Lat6], (instregex "SGF$")>;
				def : InstRW<[FXU, Lat2], (instregex "SGFR$")>;

				// Subtract halfword
				def : InstRW<[FXU, LSU, Lat6], (instregex "SH(Y)?$")>;

				// Multiply
				def : InstRW<[FXU, LSU, Lat10], (instregex "MS(GF\|Y)?$")>;
				def : InstRW<[FXU, Lat6], (instregex "MS(R\|FI)$")>;
				def : InstRW<[FXU, LSU, Lat12], (instregex "MSG$")>;
				def : InstRW<[FXU, Lat8], (instregex "MSGR$")>;
				def : InstRW<[FXU, Lat6], (instregex "MSGF(I\|R)$")>;
				def : InstRW<[FXU, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
				def : InstRW<[FXU, Lat9, GroupAlone], (instregex "MLGR$")>;
				def : InstRW<[FXU, Lat5], (instregex "MGHI$")>;
				def : InstRW<[FXU, Lat5], (instregex "MHI$")>;
				def : InstRW<[FXU, LSU, Lat9], (instregex "MH(Y)?$")>;

				// Divide
				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DSG(F)?$")>;
				def : InstRW<[FPU, FPU, FXU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?R$")>;
				def : InstRW<[FPU, FPU, LSU, FXU, FXU, FXU, FXU, Lat30, GroupAlone],
				(instregex "DL(G)?$")>;

				// And
				def : InstRW<[FXU, LSU, Lat5], (instregex "NTSTG$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "N(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "NGR(K)?$")>;
				def : InstRW<[FXU], (instregex "NI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "NI(Y)?$")>;
				def : InstRW<[FXU], (instregex "NIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "NIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "NILF(64)?$")>;
				def : InstRW<[FXU], (instregex "NILH(64)?$")>;
				def : InstRW<[FXU], (instregex "NILL(64)?$")>;
				def : InstRW<[FXU], (instregex "NR(K)?$")>;

				// Or
				def : InstRW<[FXU, LSU, Lat5], (instregex "O(G\|Y)?$")>;
				def : InstRW<[FXU], (instregex "OGR(K)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "OI(Y)?$")>;
				def : InstRW<[FXU], (instregex "OI(FMux\|HMux\|LMux)$")>;
				def : InstRW<[FXU], (instregex "OIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "OIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "OILF(64)?$")>;
				def : InstRW<[FXU], (instregex "OILH(64)?$")>;
				def : InstRW<[FXU], (instregex "OILL(64)?$")>;
				def : InstRW<[FXU], (instregex "OR(K)?$")>;

				// Xor
				def : InstRW<[FXU, LSU, Lat5], (instregex "X(G\|Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "XI(Y)?$")>;
				def : InstRW<[FXU], (instregex "XIFMux$")>;
				def : InstRW<[FXU], (instregex "XGR(K)?$")>;
				def : InstRW<[FXU], (instregex "XIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "XILF(64)?$")>;
				def : InstRW<[FXU], (instregex "XR(K)?$")>;

				// Insert
				def : InstRW<[FXU, LSU, Lat5], (instregex "IC(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "IC32(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ICM(H\|Y)?$")>;
				def : InstRW<[FXU], (instregex "II(F\|H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "IIHF(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHH(64)?$")>;
				def : InstRW<[FXU], (instregex "IIHL(64)?$")>;
				def : InstRW<[FXU], (instregex "IILF(64)?$")>;
				def : InstRW<[FXU], (instregex "IILH(64)?$")>;
				def : InstRW<[FXU], (instregex "IILL(64)?$")>;

				// And / Or / Xor character
				def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "(N\|O\|X)C$")>;

				// Shifts
				def : InstRW<[FXU], (instregex "SLL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRL(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SRA(G\|K)?$")>;
				def : InstRW<[FXU], (instregex "SLA(K)?$")>;

				// Rotate
				def : InstRW<[FXU, LSU, Lat6], (instregex "RLL(G)?$")>;

				// Rotate and insert
				def : InstRW<[FXU], (instregex "RISBG(N\|32)?$")>;
				def : InstRW<[FXU], (instregex "RISBH(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBL(G\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "RISBMux$")>;

				// Rotate and Select
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "R(N\|O\|X)SBG$")>;

				// Extend
				def : InstRW<[FXU], (instregex "AEXT128_64$")>;
				def : InstRW<[FXU], (instregex "ZEXT128_(32\|64)$")>;

				// Find leftmost one
				def : InstRW<[FXU, Lat7, GroupAlone], (instregex "FLOGR$")>;

				// Population count
				def : InstRW<[FXU, Lat3], (instregex "POPCNT$")>;

				// Compare
				def : InstRW<[FXU, LSU, Lat5], (instregex "C(G\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXU], (instregex "CFI(Mux)?$")>;
				def : InstRW<[FXU], (instregex "CG(F\|H)I$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CG(HSI\|RL)$")>;
				def : InstRW<[FXU], (instregex "C(G)?R$")>;
				def : InstRW<[FXU], (instregex "C(HI\|IH)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CH(F\|SI)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CL(Y\|Mux\|FHSI)?$")>;
				def : InstRW<[FXU], (instregex "CLFI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLG(HRL\|HSI)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGF(RL)?$")>;
				def : InstRW<[FXU], (instregex "CLGF(I\|R)$")>;
				def : InstRW<[FXU], (instregex "CLGR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLGRL$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLH(F\|RL\|HSI)$")>;
				def : InstRW<[FXU], (instregex "CLIH$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLI(Y)?$")>;
				def : InstRW<[FXU], (instregex "CLR$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "CLRL$")>;

				// Compare halfword
				def : InstRW<[FXU, LSU, Lat6], (instregex "CH(Y\|RL)?$")>;
				def : InstRW<[FXU, LSU, Lat6], (instregex "CGH(RL)?$")>;
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CHHSI$")>;

				// Compare with sign extension (32 -> 64)
				def : InstRW<[FXU, LSU, Lat6], (instregex "CGF(RL)?$")>;
				def : InstRW<[FXU, Lat2], (instregex "CGFR$")>;

				// Compare and swap
				def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CS(G\|Y)?$")>;

				// Compare logical character
				def : InstRW<[FXU, LSU, LSU, Lat9, GroupAlone], (instregex "CLC$")>;

				// Compare and trap
				def : InstRW<[FXU], (instregex "C(G)?IT$")>;
				def : InstRW<[FXU], (instregex "C(G)?RT$")>;
				def : InstRW<[FXU], (instregex "CLG(I\|R)T$")>;
				def : InstRW<[FXU], (instregex "CLFIT$")>;
				def : InstRW<[FXU], (instregex "CLRT$")>;

				// Test under mask
				def : InstRW<[FXU, LSU, Lat5], (instregex "TM(Y)?$")>;
				def : InstRW<[FXU], (instregex "TM(H\|L)Mux$")>;
				def : InstRW<[FXU], (instregex "TMHH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMHL(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLH(64)?$")>;
				def : InstRW<[FXU], (instregex "TMLL(64)?$")>;

				// Load and test
				def : InstRW<[FXU, LSU, Lat5], (instregex "LT(G\|GF)?$")>;
				def : InstRW<[FXU], (instregex "LT(G\|GF)?R$")>;

				// Moves
				def : InstRW<[FXU, LSU, Lat5], (instregex "MV(G\|H)?HI$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "MVI(Y)?$")>;

				// Move character
				def : InstRW<[LSU, LSU, LSU, FXU, Lat8, GroupAlone], (instregex "MVC$")>;

				// Move with key
				def : InstRW<[LSU, Lat8, GroupAlone], (instregex "MVCK$")>;

				// Pseudo -> reg move
				def : InstRW<[FXU], (instregex "COPY(_TO_REGCLASS)?$")>;
				def : InstRW<[FXU], (instregex "EXTRACT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "INSERT_SUBREG$")>;
				def : InstRW<[FXU], (instregex "REG_SEQUENCE$")>;
				def : InstRW<[FXU], (instregex "SUBREG_TO_REG$")>;

				// Loads (LSU)
				def : InstRW<[LSU], (instregex "L(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[LSU], (instregex "LD(Y\|E32)?$")>;
				def : InstRW<[LSU], (instregex "LG(RL)?$")>;
				def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
				def : InstRW<[LSU], (instregex "LLG(C\|F\|H\|FRL\|HRL)$")>;
				def : InstRW<[LSU], (instregex "LLH(RL\|Mux)?$")>;
				def : InstRW<[LSU], (instregex "L(X\|128)$")>;

				// Loads (FXU)
				def : InstRW<[FXU, LSU, Lat5], (instregex "LL(C\|H)H$")>;
				def : InstRW<[FXU], (instregex "LLCR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLG(C\|F\|H)R$")>;
				def : InstRW<[FXU], (instregex "LLHR(Mux)?$")>;
				def : InstRW<[FXU], (instregex "LLIH(F\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "LLIL(F\|H\|L)$")>;
				def : InstRW<[FXU], (instregex "LA(Y\|RL)?$")>;
				def : InstRW<[FXU], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAA(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAAL(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAN(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAO(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LAX(G)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LB(H\|Mux)?$")>;
				def : InstRW<[FXU], (instregex "L(B\|G)R$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LGB$")>;
				def : InstRW<[FXU], (instregex "LGBR$")>;
				def : InstRW<[FXU], (instregex "LG(F\|H)I$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(F\|H)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LG(F\|H)RL$")>;
				def : InstRW<[FXU], (instregex "LG(F\|H)R$")>;
				def : InstRW<[FXU], (instregex "LHI(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LH(H\|Y\|Mux\|RL)?$")>;
				def : InstRW<[FXU], (instregex "LHR$")>;
				def : InstRW<[FXU], (instregex "LR(Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "LRV(G\|H)?$")>;
				def : InstRW<[FXU], (instregex "LRV(G)?R$")>;

				// Load GR from FPR
				def : InstRW<[FXU, Lat3], (instregex "LGDR$")>;

				// Load multiple (estimated average of 5 ops)
				def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
				(instregex "LM(H\|Y\|G)?$")>;

				// Load Complement / Negative / Positive
				def : InstRW<[FXU], (instregex "LC(R\|GR)$")>;
				def : InstRW<[FXU, Lat2], (instregex "LN(R\|GR)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LCGFR$")>;
				def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "L(N\|P)GFR$")>;
				def : InstRW<[FXU, Lat2], (instregex "LP(G)?R$")>;

				// Load on condition
				def : InstRW<[FXU, LSU, Lat6], (instregex "LOC(G)?$")>;
				def : InstRW<[FXU, Lat2], (instregex "LOC(G)?R$")>;

				// Stores
				def : InstRW<[FXU, LSU, Lat5], (instregex "STG(RL)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST(X\|128)$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STH(H\|Y\|RL\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "ST(Y\|FH\|RL\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STC(H\|Y\|Mux)?$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRV(G\|H)?$")>;

				// Store on condition / CondStore pseudos
				def : InstRW<[FXU, LSU, Lat5], (instregex "STOC(G)?$")>;
				def : InstRW<[FXU], (instregex "CondStore16(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore16Mux(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore64(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStore8Mux(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStoreF32(Inv)?$")>;
				def : InstRW<[FXU], (instregex "CondStoreF64(Inv)?$")>;

				// Store multiple (estimated average of 3 ops)
				def : InstRW<[LSU, LSU, FXU, FXU, FXU, Lat10, GroupAlone],
				(instregex "STM(H|Y|G)?$")>;

				// Store real address
				def : InstRW<[FXU, LSU, Lat5], (instregex "STRAG$")>;

				// Select pseudo
				def : InstRW<[FXU], (instregex "Select(32|64|F32|F64|F128|32Mux)$")>;

				// String instructions
				def : InstRW<[FXU, LSU, Lat30], (instregex "SRST$")>;
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;
				def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;

				///// FLOATING POINT

				// Addition
				def : InstRW<[FPU, LSU, Lat12], (instregex "A(E|D)B$")>;
				def : InstRW<[FPU], (instregex "A(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "AXBR$")>;

				// Subtraction
				def : InstRW<[FPU, LSU, Lat12], (instregex "S(E|D)B$")>;
				def : InstRW<[FPU], (instregex "S(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat20, GroupAlone], (instregex "SXBR$")>;

				// Multiply
				def : InstRW<[FPU, LSU, Lat12], (instregex "M(D|DE|EE)B$")>;
				def : InstRW<[FPU], (instregex "M(D|DE|EE)BR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "MXDB$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "MXDBR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "MXBR$")>;

				// Multiply and add / subtract
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)EB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)EBR$")>;
				def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)DB$")>;
				def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)DBR$")>;

				// Division
				def : InstRW<[FPU, LSU, Lat30], (instregex "D(E|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "D(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "DXBR$")>;

				// Square root
				def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E|D)B$")>;
				def : InstRW<[FPU, Lat30], (instregex "SQ(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30, GroupAlone], (instregex "SQXBR$")>;

				// Convert from fixed / logical
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CX(F|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CEL(F|G)BR$")>;
				def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CDL(F|G)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat11, GroupAlone], (instregex "CXL(F|G)BR$")>;

				// Convert to fixed / logical
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F|G)XBR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLF(E|D)BR$")>;
				def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLG(E|D)BR$")>;
				def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "CL(F|G)XBR$")>;

				// Copy sign
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRd(d|s)$")>;
				def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRs(d|s)$")>;

				// Compare
				def : InstRW<[FPU, LSU, Lat12], (instregex "C(E|D)B$")>;
				def : InstRW<[FPU], (instregex "C(E|D)BR$")>;
				def : InstRW<[FPU, FPU, Lat30], (instregex "CXBR$")>;

				// Load and Test
				def : InstRW<[FPU], (instregex "LT(D|E)BR$")>;
				def : InstRW<[FPU], (instregex "LTEBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU], (instregex "LTDBRCompare(_VecPseudo)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "LTXBR$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone],
				(instregex "LTXBRCompare(_VecPseudo)?$")>;

				// Load
				def : InstRW<[LSU], (instregex "LE(Y)?$")>;
				def : InstRW<[FXU], (instregex "LER$")>;
				def : InstRW<[FXU], (instregex "LD(R|R32|GR)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LXR$")>;

				// Load zero
				def : InstRW<[FXU], (instregex "LZ(DR|ER)$")>;
				def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LZXR$")>;

				// Load Complement / Negative / Positive
				def : InstRW<[FPU], (instregex "L(C|N|P)DBR$")>;
				def : InstRW<[FPU], (instregex "L(C|N|P)EBR$")>;
				def : InstRW<[FXU], (instregex "LCDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LNDFR(_32)?$")>;
				def : InstRW<[FXU], (instregex "LPDFR(_32)?$")>;
				def : InstRW<[FPU, FPU, Lat9, GroupAlone], (instregex "L(C|N|P)XBR$")>;

				// Load lengthened
				def : InstRW<[FPU, LSU, Lat12], (instregex "LDEB$")>;
				def : InstRW<[FPU], (instregex "LDEBR$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "LX(D|E)B$")>;
				def : InstRW<[FPU, FPU, Lat10, GroupAlone], (instregex "LX(D|E)BR$")>;

				// Load rounded
				def : InstRW<[FPU], (instregex "LEDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LEXBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat20], (instregex "LDXBR(A)?$")>;

				// Load FP integer
				def : InstRW<[FPU], (instregex "FIEBR(A)?$")>;
				def : InstRW<[FPU], (instregex "FIDBR(A)?$")>;
				def : InstRW<[FPU, FPU, Lat15, GroupAlone], (instregex "FIXBR(A)?$")>;

				// Store
				def : InstRW<[FXU, LSU, Lat7], (instregex "STD(Y)?$")>;
				def : InstRW<[FXU, LSU, Lat7], (instregex "STE(Y)?$")>;

				// Test Data Class
				def : InstRW<[FPU, LSU, Lat15], (instregex "TC(E|D)B$")>;
				def : InstRW<[FPU, FPU, LSU, Lat15, GroupAlone], (instregex "TCXB$")>;

				///// INLINE ASSEMBLY

				def : InstRW<[FXU, LSU, LSU, Lat9, GroupAlone], (instregex "STCK(F)?$")>;
				def : InstRW<[LSU, LSU, LSU, LSU, FXU, FXU, Lat20, GroupAlone],
				(instregex "STCKE$")>;
				def : InstRW<[FXU, LSU, Lat5], (instregex "STFLE$")>;
				def : InstRW<[FXU, Lat30], (instregex "SVC$")>;

				///// OTHER

				// Extract Transaction Nesting Depth
				def : InstRW<[FXU], (instregex "ETND$")>;

				// Transaction begin
				def : InstRW<[LSU, LSU, FXU, FXU, FXU, FXU, FXU, Lat15, GroupAlone],
				(instregex "TBEGIN(C|_nofloat)?$")>;

				// Transaction end
				def : InstRW<[LSU, GroupAlone], (instregex "TEND$")>;

				// Transaction abort
				def : InstRW<[LSU, GroupAlone], (instregex "TABORT$")>;

				// Load the Global Offset Table address
				def : InstRW<[FXU], (instregex "GOT$")>;

				// Prefetch data
				def : InstRW<[LSU], (instregex "PFD(RL)?$")>;

				// Extract access register
				def : InstRW<[LSU], (instregex "EAR$")>;

				// Insert Program Mask
				def : InstRW<[FXU, Lat3, EndGroup], (instregex "IPM$")>;

				// Processor assist
				def : InstRW<[FXU], (instregex "PPA$")>;

				// Extract CPU Time
				def : InstRW<[FXU, Lat5, LSU], (instregex "ECTG$")>;

				// Execute
				def : InstRW<[LSU, GroupAlone], (instregex "EX(RL)?$")>;

				// Program return
				def : InstRW<[FXU, Lat30], (instregex "PR$")>;
				}
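
The `instregex` entries above select opcodes by name. As a reading aid, here is a minimal C++ sketch of how such a pattern could be checked against an opcode name. It assumes that a pattern is matched from the first character of the name (a prefix match), so that only the trailing `$` decides whether the whole name must be consumed. The helper `matchesInstRegex` is hypothetical, and `std::regex` only approximates the regex engine used by TableGen. Patterns are written with a plain `|` for alternation.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of LLVM: returns true if Pattern matches
// Opcode starting at its first character. A trailing '$' in Pattern then
// forces the match to extend to the end of the opcode name.
bool matchesInstRegex(const std::string &Pattern, const std::string &Opcode) {
  const std::regex Re(Pattern, std::regex::ECMAScript);
  // match_continuous anchors the search at the beginning of Opcode, which
  // models the assumed prefix-match behaviour.
  return std::regex_search(Opcode, Re,
                           std::regex_constants::match_continuous);
}
```

Under these assumptions, `LH(H|Y|Mux|RL)?$` covers `LH`, `LHH`, `LHY`, `LHMux` and `LHRL`, but not `LHR` or `LHI`, which is consistent with those two opcodes having their own entries.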

lib/Target/SystemZ/SystemZTargetMachine.cpp

//===-- SystemZTargetMachine.cpp - Define TargetMachine for SystemZ -------===//		//===-- SystemZTargetMachine.cpp - Define TargetMachine for SystemZ -------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "SystemZTargetMachine.h"		#include "SystemZTargetMachine.h"
#include "SystemZTargetTransformInfo.h"		#include "SystemZTargetTransformInfo.h"
		#include "SystemZMachineScheduler.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
#include "llvm/CodeGen/TargetPassConfig.h"		#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/Support/TargetRegistry.h"		#include "llvm/Support/TargetRegistry.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"		#include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"

using namespace llvm;		using namespace llvm;

(89 lines not shown)
public:		public:
SystemZPassConfig(SystemZTargetMachine *TM, PassManagerBase &PM)		SystemZPassConfig(SystemZTargetMachine *TM, PassManagerBase &PM)
: TargetPassConfig(TM, PM) {}		: TargetPassConfig(TM, PM) {}

SystemZTargetMachine &getSystemZTargetMachine() const {		SystemZTargetMachine &getSystemZTargetMachine() const {
return getTM<SystemZTargetMachine>();		return getTM<SystemZTargetMachine>();
}		}

		ScheduleDAGInstrs *
		createPostMachineScheduler(MachineSchedContext *C) const override {
		return new ScheduleDAGMI(C, make_unique<SystemZPostRASchedStrategy>(C),
		/*IsPostRA=*/true);
		}

void addIRPasses() override;		void addIRPasses() override;
bool addInstSelector() override;		bool addInstSelector() override;
void addPreSched2() override;		void addPreSched2() override;
void addPreEmitPass() override;		void addPreEmitPass() override;
};		};
} // end anonymous namespace		} // end anonymous namespace

void SystemZPassConfig::addIRPasses() {		void SystemZPassConfig::addIRPasses() {
(50 lines not shown; the following lines are inside void SystemZPassConfig::addPreEmitPass())
// preventing that would be a win or not.		// preventing that would be a win or not.
if (getOptLevel() != CodeGenOpt::None)		if (getOptLevel() != CodeGenOpt::None)
addPass(createSystemZElimComparePass(getSystemZTargetMachine()), false);		addPass(createSystemZElimComparePass(getSystemZTargetMachine()), false);
addPass(createSystemZLongBranchPass(getSystemZTargetMachine()));		addPass(createSystemZLongBranchPass(getSystemZTargetMachine()));

// Do final scheduling after all other optimizations, to get an		// Do final scheduling after all other optimizations, to get an
// optimal input for the decoder (branch relaxation must happen		// optimal input for the decoder (branch relaxation must happen
// after block placement).		// after block placement).
if (getOptLevel() != CodeGenOpt::None) {		if (getOptLevel() != CodeGenOpt::None)
if (MISchedPostRA)
addPass(&PostMachineSchedulerID);		addPass(&PostMachineSchedulerID);
else
addPass(&PostRASchedulerID);
}
}		}

TargetPassConfig *SystemZTargetMachine::createPassConfig(PassManagerBase &PM) {		TargetPassConfig *SystemZTargetMachine::createPassConfig(PassManagerBase &PM) {
return new SystemZPassConfig(this, PM);		return new SystemZPassConfig(this, PM);
}		}

TargetIRAnalysis SystemZTargetMachine::getTargetIRAnalysis() {		TargetIRAnalysis SystemZTargetMachine::getTargetIRAnalysis() {
return TargetIRAnalysis([this](const Function &F) {		return TargetIRAnalysis([this](const Function &F) {
return TargetTransformInfo(SystemZTTIImpl(this, F));		return TargetTransformInfo(SystemZTTIImpl(this, F));
});		});
}		}
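
The comment in `addPreEmitPass()` says the final scheduling pass should produce good input for the decoder, and the scheduling definitions mark instructions with `GroupAlone` and `EndGroup`. The following toy C++ model sketches how those two flags could partition an instruction stream into decoder groups. It is not the hazard recognizer from this patch: the `Inst` struct and `formDecoderGroups` are hypothetical, and the group capacity of three instructions is an assumption of the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified instruction description for this sketch.
struct Inst {
  std::string Name;
  bool GroupAlone; // must occupy a decoder group by itself
  bool EndGroup;   // must be the last instruction of its group
};

// Splits Seq into decoder groups, in program order.
// Assumption: a group holds at most MaxGroupSize instructions.
std::vector<std::vector<std::string>>
formDecoderGroups(const std::vector<Inst> &Seq, unsigned MaxGroupSize = 3) {
  std::vector<std::vector<std::string>> Groups;
  std::vector<std::string> Cur;
  // Closes the group under construction, if it is non-empty.
  auto Flush = [&] {
    if (!Cur.empty()) {
      Groups.push_back(Cur);
      Cur.clear();
    }
  };
  for (const Inst &I : Seq) {
    if (I.GroupAlone) {
      // Close whatever is open, then emit a single-instruction group.
      Flush();
      Groups.push_back(std::vector<std::string>{I.Name});
      continue;
    }
    Cur.push_back(I.Name);
    if (I.EndGroup || Cur.size() == MaxGroupSize)
      Flush();
  }
  Flush();
  return Groups;
}
```

In this model a `GroupAlone` instruction that arrives while a group is only partly filled leaves the remaining slots unused, which is the kind of waste a scheduler aware of decoder groups would try to avoid by reordering.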

test/CodeGen/SystemZ/vec-args-06.ll

	(36 lines not shown)

	; More than eight vector return values use sret.			; More than eight vector return values use sret.
	define { <2 x double>, <2 x double>, <2 x double>, <2 x double>,			define { <2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double>, <2 x double>, <2 x double>, <2 x double>,			<2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double> } @f2() {			<2 x double> } @f2() {
	; CHECK-LABEL: f2:			; CHECK-LABEL: f2:
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 128(%r2)			; CHECK-DAG: vst [[VTMP]], 128(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 112(%r2)			; CHECK-DAG: vst [[VTMP]], 112(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 96(%r2)			; CHECK-DAG: vst [[VTMP]], 96(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 80(%r2)			; CHECK-DAG: vst [[VTMP]], 80(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 64(%r2)			; CHECK-DAG: vst [[VTMP]], 64(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 48(%r2)			; CHECK-DAG: vst [[VTMP]], 48(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 32(%r2)			; CHECK-DAG: vst [[VTMP]], 32(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 16(%r2)			; CHECK-DAG: vst [[VTMP]], 16(%r2)
	; CHECK: larl [[TMP:%r[0-5]]], .LCPI			; CHECK-DAG: larl [[TMP:%r[0-5]]], .LCPI
	; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])			; CHECK: vl [[VTMP:%v[0-9]+]], 0([[TMP]])
	; CHECK: vst [[VTMP]], 0(%r2)			; CHECK: vst [[VTMP]], 0(%r2)
	; CHECK: br %r14			; CHECK: br %r14
	ret { <2 x double>, <2 x double>, <2 x double>, <2 x double>,			ret { <2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double>, <2 x double>, <2 x double>, <2 x double>,			<2 x double>, <2 x double>, <2 x double>, <2 x double>,
	<2 x double> }			<2 x double> }
	{ <2 x double> <double 1.0, double 1.1>,			{ <2 x double> <double 1.0, double 1.1>,
	<2 x double> <double 2.0, double 2.1>,			<2 x double> <double 2.0, double 2.1>,
	<2 x double> <double 3.0, double 3.1>,			<2 x double> <double 3.0, double 3.1>,
	<2 x double> <double 4.0, double 4.1>,			<2 x double> <double 4.0, double 4.1>,
	<2 x double> <double 5.0, double 5.1>,			<2 x double> <double 5.0, double 5.1>,
	<2 x double> <double 6.0, double 6.1>,			<2 x double> <double 6.0, double 6.1>,
	<2 x double> <double 7.0, double 7.1>,			<2 x double> <double 7.0, double 7.1>,
	<2 x double> <double 8.0, double 8.1>,			<2 x double> <double 8.0, double 8.1>,
	<2 x double> <double 9.0, double 9.1> }			<2 x double> <double 9.0, double 9.1> }
	}			}

test/CodeGen/SystemZ/vec-perm-12.ll

	; Test inserting a truncated value into a vector element			; Test inserting a truncated value into a vector element
	;			;
	; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \			; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \
	; RUN: FileCheck -check-prefix=CHECK-CODE %s			; RUN: FileCheck -check-prefix=CHECK-CODE %s
	; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \			; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \
	; RUN: FileCheck -check-prefix=CHECK-VECTOR %s			; RUN: FileCheck -check-prefix=CHECK-VECTOR %s

	define <4 x i32> @f1(<4 x i32> %x, i64 %y) {			define <4 x i32> @f1(<4 x i32> %x, i64 %y) {
	; CHECK-CODE-LABEL: f1:			; CHECK-CODE-LABEL: f1:
	; CHECK-CODE: vlvgf [[ELT:%v[0-9]+]], %r2, 0			; CHECK-CODE-DAG: vlvgf [[ELT:%v[0-9]+]], %r2, 0
	; CHECK-CODE: larl [[REG:%r[0-5]]],			; CHECK-CODE-DAG: larl [[REG:%r[0-5]]],
	; CHECK-CODE: vl [[MASK:%v[0-9]+]], 0([[REG]])			; CHECK-CODE-DAG: vl [[MASK:%v[0-9]+]], 0([[REG]])
	; CHECK-CODE: vperm %v24, %v24, [[ELT]], [[MASK]]			; CHECK-CODE: vperm %v24, %v24, [[ELT]], [[MASK]]
	; CHECK-CODE: br %r14			; CHECK-CODE: br %r14

	; CHECK-VECTOR: .byte 12			; CHECK-VECTOR: .byte 12
	; CHECK-VECTOR-NEXT: .byte 13			; CHECK-VECTOR-NEXT: .byte 13
	; CHECK-VECTOR-NEXT: .byte 14			; CHECK-VECTOR-NEXT: .byte 14
	; CHECK-VECTOR-NEXT: .byte 15			; CHECK-VECTOR-NEXT: .byte 15
	; CHECK-VECTOR-NEXT: .byte 8			; CHECK-VECTOR-NEXT: .byte 8
	(23 lines not shown)
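
Both test updates relax `CHECK` lines to `CHECK-DAG` because the scheduler may now emit independent instructions in a different order. The C++ sketch below illustrates the difference with a deliberately simplified model: patterns are plain substrings, `matchInOrder` and `matchAnyOrder` are hypothetical helpers, and the sample assembly lines are made up. It is not FileCheck's actual matching algorithm.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Ordered matching (roughly what a run of plain CHECK lines asks for):
// each pattern must be found on a line after the previous match.
bool matchInOrder(const std::vector<std::string> &Lines,
                  const std::vector<std::string> &Patterns) {
  std::size_t Next = 0;
  for (const std::string &P : Patterns) {
    bool Found = false;
    for (; Next < Lines.size(); ++Next) {
      if (Lines[Next].find(P) != std::string::npos) {
        Found = true;
        ++Next; // continue searching after the matched line
        break;
      }
    }
    if (!Found)
      return false;
  }
  return true;
}

// Unordered matching (roughly what a run of CHECK-DAG lines asks for):
// each pattern must match some not-yet-used line, in any order.
// Simplification: lines are claimed greedily, first come first served.
bool matchAnyOrder(const std::vector<std::string> &Lines,
                   const std::vector<std::string> &Patterns) {
  std::vector<bool> Used(Lines.size(), false);
  for (const std::string &P : Patterns) {
    bool Found = false;
    for (std::size_t I = 0; I < Lines.size(); ++I) {
      if (!Used[I] && Lines[I].find(P) != std::string::npos) {
        Used[I] = true;
        Found = true;
        break;
      }
    }
    if (!Found)
      return false;
  }
  return true;
}
```

If the scheduler hoists the `larl` above the `vlvgf`, the ordered check on `vlvgf`, `larl`, `vl` fails while the unordered one still passes, so the test keeps verifying that the instructions are present without pinning their relative order.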


Revision Contents

Diff 73919

include/llvm/CodeGen/ScheduleDAG.h

lib/CodeGen/ScheduleDAGInstrs.cpp

lib/Target/SystemZ/CMakeLists.txt

lib/Target/SystemZ/SystemZ.td

lib/Target/SystemZ/SystemZHazardRecognizer.h

lib/Target/SystemZ/SystemZHazardRecognizer.cpp

lib/Target/SystemZ/SystemZInstrInfo.h

lib/Target/SystemZ/SystemZInstrInfo.cpp

lib/Target/SystemZ/SystemZMachineScheduler.h

lib/Target/SystemZ/SystemZMachineScheduler.cpp

lib/Target/SystemZ/SystemZProcessors.td

lib/Target/SystemZ/SystemZSchedule.td

lib/Target/SystemZ/SystemZScheduleZ13.td

lib/Target/SystemZ/SystemZScheduleZ196.td

lib/Target/SystemZ/SystemZScheduleZEC12.td

lib/Target/SystemZ/SystemZTargetMachine.cpp

test/CodeGen/SystemZ/vec-args-06.ll

test/CodeGen/SystemZ/vec-perm-12.ll
