This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Improve decoding in case of instructions with four register operands.
ClosedPublic

Authored by jonpa on Jul 26 2018, 5:16 AM.

Details

Reviewers
uweigand
Summary

Since z13, the max group size will be 2 if any μop has more than 3 register sources.

This has been ignored sofar in the SystemZHazardRecognizer, but is now handled by recognizing those instructions and adjusting the tracking of decoding and the cost heuristic for grouping.

  • If the current decoder group has 2 uops, and the candidate has 4RegOps, then a cost of 1 is returned from groupingCost().
  • Some of the relevant instructions (non-vector) group alone. The handling of these should be as before.
  • The absolute grouping cost is not adjusted in groupingCost() between 2 or 3 depending on CurrGroupHas4RegOps, since it should not matter.
  • If the current group has an instruction with 4 reg operands, the max group size becomes 2, except in the case of a single group-alone instruction which per default has 3 uops to represent a full decoder group. This also means that a cracked instruction that has 4 reg ops (if there is such) would group alone.
  • One very minor SystemZ test update.

Diff Detail

Event Timeline

jonpa created this revision.Jul 26 2018, 5:16 AM
uweigand added inline comments.Jul 26 2018, 10:19 AM
lib/Target/SystemZ/SystemZHazardRecognizer.cpp
123

Have you verified that this doesn't count things like CC?

jonpa marked an inline comment as done.Jul 27 2018, 4:28 AM

These are the opcodes that would return true from has4RegOps:

z196, zEC12:
Group alone:
LMD, MAD, MADB, MAE, MAEB, MAY, MAYH, MAYL, MSD, MSDB, MSE, MSEB, MVCK, MVCP, MVCS, MY, MYH, MYL, PLO, SLDT, SLXT, SRDT, SRXT

Z13, in addition:
Normal grouping:
VAC, VACCC, VACCCQ, VACQ, VFMA, VFMADB, VFMS, VFMSDB, VGFMA, VGFMAB, VGFMAF, VGFMAG, VGFMAH, VMAE, VMAEB, VMAEF, VMAEH, VMAH, VMAHB, VMAHF, VMAHH, VMAL, VMALB, VMALE, VMALEB, VMALEF, VMALEH, VMALF, VMALH, VMALHB, VMALHF, VMALHH, VMALHW, VMALO, VMALOB, VMALOF, VMALOH, VMAO, VMAOB, VMAOF, VMAOH, VPERM, VSBCBI, VSBCBIQ, VSBI, VSBIQ, VSEL, VSTRC, VSTRCB, VSTRCBS, VSTRCF, VSTRCFS, VSTRCH, VSTRCHS, VSTRCZB, VSTRCZBS, VSTRCZF, VSTRCZFS, VSTRCZH, VSTRCZHS, WFMADB, WFMSDB

Z14, in addition:
Normal grouping:
VFMASB, VFMSSB, VFNMA, VFNMADB, VFNMASB, VFNMS, VFNMSDB, VFNMSSB, VMSL, VMSLG, WFMASB, WFMAXB, WFMSSB, WFMSXB, WFNMADB, WFNMASB, WFNMAXB, WFNMSDB, WFNMSSB, WFNMSXB

I have built SPEC on all subtargets from z196 and on, and found:

  • z196 and zEC12 both NFC on SPEC, as expected since all affected instructions group alone.
  • Instructions actually exposed to scheduler (emitted) of the above:

z196: none
zEC12: none
z13:

 10 VMALB
616 VMALF
 10 VMALHW
797 VPERM
424 VSEL

z14:

  10 VMALB
 616 VMALF
  10 VMALHW
1978 VPERM
 432 VSEL
lib/Target/SystemZ/SystemZHazardRecognizer.cpp
123

Yes, the NumOperands are the explicit defs and uses. Then there is in addition the lists of implicit defs and uses.

So, AG has
getNumOperands = 5
MID.hasImplicitDefOfPhysReg(SystemZ::CC) = true
The operands get printed like
DEF[GR64Bit] TIED-USE[GR64Bit] USE[ADDR64Bit] [IMM] USE[ADDR64Bit]
The count of regops = 3 (excluding any tied use).

uweigand accepted this revision.Jul 27 2018, 4:48 AM

Ah, OK. Thanks for checking!

Patch LGTM.

This revision is now accepted and ready to land.Jul 27 2018, 4:48 AM
jonpa updated this revision to Diff 157963.Jul 30 2018, 7:52 AM
jonpa marked an inline comment as done.

TII made available also with NDEBUG.

jonpa closed this revision.Jul 31 2018, 6:02 AM

Committed as r338368.

bjope added a subscriber: bjope.Jul 31 2018, 10:34 AM
bjope added inline comments.
lib/Target/SystemZ/SystemZHazardRecognizer.cpp
104

You probably want to do assert((something) && "") instead of assert((something) || "") here.

jonpa marked an inline comment as done.Jul 31 2018, 1:01 PM
lib/Target/SystemZ/SystemZHazardRecognizer.cpp
104

oops... fixed with r338429. thanks.