This is an archive of the discontinued LLVM Phabricator instance.

[X86][SandyBridge,Haswell] Updating information on each instruction in HSW and SNB about latency, number of uOps and used ports
AbandonedPublic

Authored by gadi.haber on Jun 5 2017, 7:57 AM.

Details

Summary

This patch completely replaces the scheduling information for the SandyBridge and the Haswell architecture targets by completely modifying the files: X86SchedSandyBridge.td and X86SchedHaswell.td located under the X86 Target.
The Haswell and SandyBridge architects have provided us with the correct information about each instruction and I use it to replace existing incorrect instructions scheduling and to add missing instructions in the two files.

Please note that the patch extensively affects the X86 MC instr scheduling.

The updated and extended information about each instruction in HSW and SNB includes the following data:

  • static latency of instruction
  • number of uOps from which the instruction consists of
  • all ports used by the instruction

For example, the following code dictates that instructions, ADC64mr, ADC8mr, SBB64mr, SBB8mr have a static latency of 9 cycles. Each of these instructions is decoded into 6 micro operations whic use ports 4, ports 2 or 3 and port 0 and ports 0 or 1 or 5:

def SBWriteResGroup94 : SchedWriteRes<[SBPort4,SBPort23,SBPort0,SBPort015]> {

let Latency = 9;
let NumMicroOps = 6;
let ResourceCycles = [1,2,2,1];

}
def: InstRW<[SBWriteResGroup94], (instregex "ADC64mr")>;
def: InstRW<[SBWriteResGroup94], (instregex "ADC8mr")>;
def: InstRW<[SBWriteResGroup94], (instregex "SBB64mr")>;
def: InstRW<[SBWriteResGroup94], (instregex "SBB8mr")>;

Diff Detail

Event Timeline

gadi.haber created this revision.Jun 5 2017, 7:57 AM
RKSimon added inline comments.Jun 5 2017, 10:21 AM
test/CodeGen/X86/avx-schedule.ll
19

How come you're not including Haswell memory latency any more?

gadi.haber added inline comments.Jun 6 2017, 1:17 AM
test/CodeGen/X86/avx-schedule.ll
19

Good point. The new modeling is more accurate according to HSW architects - including for memory operations.
Based on my performance measurements it is indeed better although there may be some minor additional changes that I am currently checking.

lsaba added a subscriber: lsaba.Jun 6 2017, 2:33 AM
javed.absar added inline comments.
lib/Target/X86/X86SchedSandyBridge.td
29

An alternative would be to call out those instructions using 'UnSupported'. This will let you set CompleteModel = 1.
Please see - https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AArch64/AArch64SchedFalkor.td#L65

avt77 added a subscriber: avt77.Jun 6 2017, 6:31 AM
avt77 added a comment.Jun 6 2017, 6:36 AM

Why don't you use regular expressions instead of simple list of instructions?

Two reasons why I am not using regular expressions instead of simple list of instructions:

  1. This td file is automatically generated (apart for its header) based on the data I receive from the architects.
  2. In this new version we have many cases of instruction categories that fall into different resource groups - making the regular expression hard to follow.
lib/Target/X86/X86SchedSandyBridge.td
29

Thanx for the tip.
I tried changing this flag to 1 but then I get a list of several thousand unsupported instruction scheduling - even instructions that are not in HSW or SNB.
I plan to add the SkylakeClient scheduling td file soon. Maybe then it will reduce this number to a manageable size.

Updated diff following fixes made to 3 LITs under Codegen X86.

Updated diff to include recent changes made to X86 schedule .td files for horizontal add/sub instructions

zvi edited edge metadata.Jun 26 2017, 4:57 AM

LGTM, but maybe wait for another ok before committing.

rebase with latest LLVM version fro June 27 2017

This revision was automatically updated to reflect the committed changes.

While the work here is pretty great, you were asked to wait for someone else to LGTM and you didn't. That's really poor form. There are plenty of issues here that really should be addressed before this was landed. Please revert and start looking at them.

  1. The commit message is very poorly formated and doesn't actually describe the kinds of differences that people should expect.
  1. There are *numerous* substantial changes to the tests that are completely unrelated to this patch. For example, see the 'half.ll' test case. Please update the FileCheck lines for tests you need to modify in a NFC patch that *just* gives you a clean baseline so that the only thing changing here are the scheduling differences. this is especially important considering how many tests are changed and in how many ways.
  1. This gives a very significant new influx of data to the scheduler. I'm actually really excited about that. However, even looking at thet test cases updated, I see substantially different instruction sequences. What testing have you done to ensure this is correct? We have at least one internal user (based on Halide) that sees correctness regressions after this commit. It may not be this commit that has the bug, but if we have latent missing constraints on instructions (which we probably do) massive changes to the scheduler are likely to expose bugs. It may be worth discussing in the change what testing has been done and ask others to help test such a large change as this.

Thanks,
-Chandler

Folks, this patch *didn't even go to llvm-commits*. So it got *NO* public code review by the broader community.

Please, *do not do this*. Folks at Intel, please check patches for stuff like this before giving LGTMs. I'm going to go reply to the commit itself so that the rest of the community is aware of what is going on here.

gadi.haber reopened this revision.Jun 28 2017, 3:52 AM

Following Chandler's comments I am reverting the commit to address the issues reported in the modified tests.

gadi.haber edited the summary of this revision. (Show Details)Jun 28 2017, 5:10 AM

Dear Chandler.

Following your comments here is what was done:

  1. I extended the details in the summary about the changes. Please let me know if this is what you had in mind. I moved you from subscriber to a reviewer and would be happy to get your comments.
  2. I will go over the problematic tests mentioned. I reverted the commit in order to go over the changes made to the tests.
  3. We ran our internal X86 tests on the patch and they all passed successfully. We will revisit them to see if they should be extended.

Best regards:

Gadi.

gadi.haber edited the summary of this revision. (Show Details)Jun 28 2017, 5:54 AM
gadi.haber abandoned this revision.Jun 28 2017, 11:15 PM