This is an archive of the discontinued LLVM Phabricator instance.

[X86][SKL] Updated scheduling information for the SkylakeClient target
Closed, Public

Authored by gadi.haber on Oct 10 2017, 6:34 AM.

Details

Summary

Updated the scheduling information for the SkylakeClient target with the following changes:

  1. regrouped the instructions after adding load and store latencies.
  2. regrouped the instructions after adding identified missing ports in several groups.

The changes were made after revisiting the latency impact of all the load and store uOps.

Diff Detail

Repository
rL LLVM

Event Timeline

gadi.haber created this revision. Oct 10 2017, 6:34 AM
RKSimon added inline comments. Oct 10 2017, 12:56 PM
test/CodeGen/X86/bmi2-schedule.ll
5 ↗(On Diff #118373)

Remove COMMON from SKYLAKE (or just remove COMMON entirely)

Updated diff following Simon's comment to remove the COMMON check prefix from the bmi2 scheduling test.

gadi.haber marked an inline comment as done. Oct 11 2017, 2:29 AM
gadi.haber added inline comments.
test/CodeGen/X86/bmi2-schedule.ll
5 ↗(On Diff #118373)

Removed all COMMON check prefixes.
Simon, please note that the next patch I am preparing (before fixing the HSW scheduling) is for the BDW target, and we may need to extend these tests to include it.

Would it be possible to preprocess the *.td file with a compacting script that makes use of the regular-expression feature of instregex, e.g. (instregex "MMX_*"), instead of listing everything?

gadi.haber marked an inline comment as done. Oct 11 2017, 4:57 AM

The .td file is actually generated by a script, and as you may have noticed it already contains some (not many) regular expressions where possible.
For the SKX scheduling, for example, there are many regular expressions that include the broadcast, mask and zeroing bits for all relevant AVX512 instructions.
The problem is that there are not many opportunities to group instructions into regular expressions.
For example, the MMX_* instructions are spread between groups 1, 2, 3, 8, 9, 12, etc.
The groups can differ in any of the latency, the number of uOps, or the ports used by the uOps.
This makes it hard to use regular expressions.
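
To illustrate the point above, here is a minimal Python sketch of why a shared name prefix rarely collapses into a single regular expression. It is not the actual generator script. The instruction names, latencies, uOp counts and port names are hypothetical placeholders, not real Skylake data, and the helper names `group_by_signature` and `prefix_is_collapsible` are invented for this example.

```python
from collections import defaultdict

# Hypothetical scheduling data: name -> (latency, number of uOps, ports).
# All names and numbers are placeholders, NOT real Skylake values.
SCHED_INFO = {
    "MMX_FOOrr": (1, 1, ("P0",)),
    "MMX_FOOrm": (6, 2, ("P0", "P23")),
    "MMX_BARrr": (1, 1, ("P0",)),
    "MMX_BARrm": (6, 2, ("P0", "P23")),
    "MMX_BAZrr": (1, 1, ("P5",)),
    "OTHER_QUXrr": (1, 1, ("P0",)),
}


def group_by_signature(sched_info):
    """Put instructions with identical (latency, uOps, ports) in one group."""
    groups = defaultdict(list)
    for name, signature in sched_info.items():
        groups[signature].append(name)
    return {sig: sorted(names) for sig, names in groups.items()}


def prefix_is_collapsible(sched_info, prefix):
    """True only if every instruction with this prefix shares one signature."""
    signatures = {
        sig for name, sig in sched_info.items() if name.startswith(prefix)
    }
    return len(signatures) == 1


groups = group_by_signature(SCHED_INFO)
for signature, names in groups.items():
    print(signature, names)

# The MMX_ prefix spans three different signatures, so one regex cannot cover it.
print("MMX_ collapsible:", prefix_is_collapsible(SCHED_INFO, "MMX_"))
```

In this toy data the `MMX_` names land in three groups, one of which also holds a non-MMX instruction, so a single prefix pattern would assign the wrong latency or ports to some of them.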

This modified version of the SkylakeClient schedulings does indeed show gains on SKL without apparent regressions when compared to the existing SKL schedulings.
Thus, we are motivated to continue with the process of updating all X86 target schedulings.
Next step, once this patch is committed, is to submit the Broadwell schedulings for review.

RKSimon accepted this revision. Oct 16 2017, 10:57 AM

LGTM. I'm intending to add MMX scheduling tests shortly so if they land before this you may need to rebase + regenerate.

This revision is now accepted and ready to land. Oct 16 2017, 10:57 AM
This revision was automatically updated to reflect the committed changes.