Following recent discussions (http://lists.llvm.org/pipermail/llvm-dev/2018-May/123164.html), this patch changes the SystemZ SchedModels so that IssueWidth is 6, which is the decoder capacity, and NumMicroOps becomes the number of decoder slots needed per instruction. (Note: this first diff covers only z13 and z14; z196 and zEC12 will be updated once everything looks good.)
The patch also makes sure that the SchedWrite latencies match the MachineInstruction def-operand indexes. Basically, I have added a WLat for each def-operand, carrying the instruction latency.
ReadAdvances have also been added on instructions with one register operand and one memory operand. The register operand is needed later than the address registers, since the load from memory must be done first. Since all these instructions have the register operand before the address operands, it is enough to insert a single ReadAdvance in the list.
I have used an approach that tries to separate the different concerns (latencies, functional units and micro-ops/grouping). All instructions are one micro-op, except in cases with special grouping rules, so micro-ops and grouping belong together. The InstRW lists are patterned like:
[Def-operand latencies, use operand read advances, FUs, GroupingRule]
For example, Insert-Character has one register (first use operand) and one memory operand, produces its result after 1 cycle, uses the FXa and LSU units, and groups normally:
def : InstRW<[WLat1LSU, RegReadAdv, FXa, LSU, NormalGr], (instregex "IC(Y)?$")>;
I think this looks nice and simple. Does this look sound?
Some minor questions:
- When duplicating the latency for the CC operand, I have simply inserted another WLat entry. Is it possible, and would it be better, to instead use some kind of list like WLatCC, so that only one entry is exposed in the InstRW?
- I tried making loops for definitions such as
def : WriteRes<LSU, [Z13_LSUnit]>;
def : WriteRes<LSU2, [Z13_LSUnit]> { let ResourceCycles = [2]; }
def : WriteRes<LSU3, [Z13_LSUnit]> { let ResourceCycles = [3]; }
def : WriteRes<LSU4, [Z13_LSUnit]> { let ResourceCycles = [4]; }
def : WriteRes<LSU5, [Z13_LSUnit]> { let ResourceCycles = [5]; }
but did not find a way to do this (I think it did not work to express an LSU#I SchedWrite). I guess the current definitions above will have to do?
- I would like to have a constant for the LSU latency, a value that would also be used by RegReadAdv. Right now it is hard-coded to '4' in both places. How is this done in TableGen?
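On the last question, one possible approach is to keep the value in a small record and read it back through a field access. The following is only a minimal sketch: the classes at the top are simplified stand-ins for the ones in llvm/Target/TargetSchedule.td so that the fragment parses on its own, the record name LSULatency and its field val are hypothetical, and it assumes that WLat1LSU means "1 cycle plus the LSU latency".

```tablegen
// Simplified stand-ins; the real classes live in
// llvm/Target/TargetSchedule.td and take more parameters.
class SchedRead;
class SchedWrite;
class ReadAdvance<SchedRead read, int cycles> {
  SchedRead ReadType = read;
  int Cycles = cycles;
}
class WriteRes<SchedWrite write> {
  SchedWrite WriteType = write;
  int Latency = 1;
}

// A record used purely as a named constant (hypothetical name and field).
def LSULatency {
  int val = 4;
}

// The read advance takes its cycle count from the constant.
def RegReadAdv : SchedRead;
def : ReadAdvance<RegReadAdv, LSULatency.val>;

// Assumption: WLat1LSU is an instruction latency of 1 plus the LSU latency.
def WLat1LSU : SchedWrite;
def : WriteRes<WLat1LSU> {
  let Latency = !add(1, LSULatency.val);
}
```

With this arrangement the '4' appears only once, and both the read advance and the load-dependent latencies follow it if the value changes.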
You could rewrite it more compactly as:
foreach FPUSuffix = ["a2", "a3", "a4"] in {
}
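For the LSU definitions in the question, a loop along the following lines might work. This is a sketch, not the snippet suggested above: the classes are simplified stand-ins for those in llvm/Target/TargetSchedule.td so that the fragment parses on its own, the unit count given to Z13_LSUnit is illustrative, and the loop variable Num is a hypothetical name. The idea is to build the record name as a string and look it up with !cast, rather than trying to paste LSU#I directly where a SchedWrite is expected.

```tablegen
// Simplified stand-ins; the real classes live in
// llvm/Target/TargetSchedule.td and take more parameters.
class SchedWrite;
class ProcResource<int num> {
  int NumUnits = num;
}
class WriteRes<SchedWrite write, list<ProcResource> resources> {
  SchedWrite WriteType = write;
  list<ProcResource> ProcResources = resources;
  list<int> ResourceCycles = [];
}

// Illustrative unit count, not taken from the patch.
def Z13_LSUnit : ProcResource<2>;

// Declare LSU, then LSU2 .. LSU5 by pasting the loop value onto the name.
def LSU : SchedWrite;
foreach Num = 2-5 in {
  def LSU#Num : SchedWrite;
}

// One cycle by default; LSU2 .. LSU5 occupy the unit for Num cycles.
def : WriteRes<LSU, [Z13_LSUnit]>;
foreach Num = 2-5 in {
  let ResourceCycles = [Num] in
  def : WriteRes<!cast<SchedWrite>("LSU"#Num), [Z13_LSUnit]>;
}
```

The same shape should extend to the other functional units by adding one more def line per unit inside the second loop.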